{ "cells": [ { "metadata": {}, "cell_type": "markdown", "source": [ "# MinIO\n", "\n", "[MinIO](https://min.io/) is a high-performance, distributed object storage system. It is built for large-scale AI/ML, data lake, and database workloads. It is software-defined, runs on any cloud or on-premises infrastructure, and is open source under the GNU AGPLv3 license.\n", "MinIO is designed to be used as private cloud storage for storing and sharing files and data. It is compatible with the Amazon S3 API.\n", "\n", "This tutorial shows how to use MinIO from [Python](https://min.io/docs/minio/linux/developers/python/API.html) and from the MinIO client CLI tool ([mc](https://min.io/docs/minio/linux/reference/minio-mc.html))." ], "id": "3da60b1dd7781625" }, { "metadata": {}, "cell_type": "markdown", "source": "## MinIO Python API", "id": "aef5bb43fc1bf5d1" }, { "metadata": {}, "cell_type": "code", "source": "!pip install minio", "id": "2e0582a0ea7b18f4", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": [ "from minio import Minio\n", "import os" ], "id": "8acc3109-3d3e-4ec1-a8fe-e3de2a8a17c9", "outputs": [], "execution_count": null }, { "cell_type": "code", "id": "b807f0a3-5970-456a-a24c-67e77af4ea09", "metadata": {}, "source": [ "# Credentials are read from environment variables; secure=False means plain HTTP.\n", "client = Minio(\"minio:80\",\n", "               access_key=os.environ[\"MINIO_ACCESS_KEY\"],\n", "               secret_key=os.environ[\"MINIO_SECRET_KEY\"],\n", "               secure=False)" ], "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## List buckets", "id": "84c35389dc3af2e3" }, { "cell_type": "code", "id": "b5a41424-04c6-4015-8f61-f9b5f1f93239", "metadata": {}, "source": [ "client.list_buckets()" ], "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## List objects", "id": "e72d2465a1a39eda" }, { "metadata": {}, "cell_type": "code", "source": [ "# Print information about each object at the top level of the bucket.\n", "objects = client.list_objects(\"my-bucket\")\n", "for obj in objects:\n", 
" print(obj)" ], "id": "c783256ecd9c5fbd", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Create a bucket", "id": "52764415edc68319" }, { "cell_type": "code", "id": "2e5954db-dab5-4444-8da1-4ab4b7f1880d", "metadata": {}, "source": "client.make_bucket(\"BUCKET_NAME\")", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Copy a file to MinIO", "id": "3730277a262a3090" }, { "metadata": {}, "cell_type": "code", "source": "client.fput_object(\"BUCKET_NAME\", \"BUCKET_PATH\", \"LOCAL_FILE\")", "id": "41147d0d29e67c10", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Copy a folder to MinIO", "id": "66c4ee0ee39b05f3" }, { "metadata": {}, "cell_type": "code", "source": [ "import glob\n", "\n", "def upload_local_directory_to_minio(local_path, bucket_name, minio_path):\n", " assert os.path.isdir(local_path)\n", "\n", " for local_file in glob.glob(local_path + '/**'):\n", " local_file = local_file.replace(os.sep, \"/\") # Replace \\ with / on Windows\n", " if not os.path.isfile(local_file):\n", " upload_local_directory_to_minio(\n", " local_file, bucket_name, minio_path + \"/\" + os.path.basename(local_file))\n", " else:\n", " remote_path = os.path.join(\n", " minio_path, local_file[1 + len(local_path):])\n", " remote_path = remote_path.replace(\n", " os.sep, \"/\") # Replace \\ with / on Windows\n", " client.fput_object(bucket_name, remote_path, local_file)" ], "id": "f2b1a9ebefecd27", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": "upload_local_directory_to_minio(\"LOCAL_FOLDER\",\"BUCKET\",\"BUCKET_PATH\")", "id": "bb8cb1a2a632a9c1", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Copy a file from MinIO", "id": "bb1ac69dd0647008" }, { "metadata": {}, "cell_type": "code", "source": "client.fget_object(\"BUCKET_NAME\", \"FILENAME\", 
\"LOCAL_FILE\")", "id": "79d6fe5db820b030", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Copy a folder from MinIO", "id": "3a83d39e6625257f" }, { "metadata": {}, "cell_type": "code", "source": [ "def download_minio_directory_to_local(minio_path, local_path, bucket_name):\n", " objects = client.list_objects(bucket_name, prefix=minio_path, recursive=True)\n", " for obj in objects:\n", " remote_path = obj.object_name\n", " local_file = os.path.join(local_path, remote_path[len(minio_path):])\n", " local_file = local_file.replace(os.sep, \"/\") # Replace \\ with / on Windows\n", " local_dir = os.path.dirname(local_file)\n", " if not os.path.exists(local_dir):\n", " os.makedirs(local_dir)\n", " client.fget_object(bucket_name, remote_path, local_file)" ], "id": "1274083acc1b14a1", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "# MinIO CLI", "id": "aaf3f9930131ae44" }, { "metadata": {}, "cell_type": "code", "source": [ "%%bash\n", "\n", "curl https://dl.min.io/client/mc/release/linux-amd64/mc \\\n", " --create-dirs \\\n", " -o $HOME/minio-binaries/mc\n", "\n", "chmod +x $HOME/minio-binaries/mc\n", "export PATH=$PATH:$HOME/minio-binaries/" ], "id": "3c4b63d2292722f2", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": "!mc alias set minio http://minio:80 $MINIO_ACCESS_KEY $MINIO_SECRET_KEY", "id": "93c7c959fe6588d6", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## List buckets", "id": "b3602f5c3c34aa55" }, { "metadata": {}, "cell_type": "code", "source": "!mc ls minio", "id": "3ae49dd634287361", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Create a bucket", "id": "9ee077d054f268ce" }, { "metadata": {}, "cell_type": "code", "source": "!mc mb minio/BUCKET_NAME", "id": "d2d96a1da6daeae1", "outputs": [], "execution_count": null }, { 
"metadata": {}, "cell_type": "markdown", "source": "## Copy a file or folder to MinIO", "id": "797616b15119f77b" }, { "metadata": {}, "cell_type": "code", "source": "!mc cp [--recursive] LOCAL_FILE minio/BUCKET_NAME/BUCKET_PATH", "id": "c8f276231e373c87", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Copy a file or folder from MinIO", "id": "b3497f0cafaa32be" }, { "metadata": {}, "cell_type": "code", "source": "!mc cp [--recursive] minio/BUCKET_NAME/BUCKET_PATH LOCAL_FILE", "id": "4c2e8461cb24bd53", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Sync a folder to MinIO", "id": "e71635f1116ed43c" }, { "metadata": {}, "cell_type": "code", "source": "!mc mirror [--watch] minio/BUCKET LOCAL_FOLDER", "id": "22ad702ee8d004f4", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "markdown", "source": "## Create a link to externally upload files to MinIO", "id": "a95629d5de7f9ac0" }, { "metadata": {}, "cell_type": "code", "source": "!mc share upload --recursive --expire TIME minio/BUCKET", "id": "8a6385a10464f27b", "outputs": [], "execution_count": null }, { "metadata": {}, "cell_type": "code", "source": [ "from pathlib import Path\n", "import os\n", "from tqdm import tqdm\n", "import subprocess\n", "\n", "\n", "\n", "if __name__ == \"__main__\":\n", " rootdir = 'LOCAL_PATH'\n", " bucket_path = 'BUCKET_PATH'\n", " curl_command = \"curl /\"\n", " curl_command += \" \"\n", " main_curl_command = curl_command.split(\" \")\n", " for subdir, dirs, files in os.walk(rootdir):\n", " for file in tqdm(files):\n", " curl_command = main_curl_command.copy()\n", " curl_command.append(\"-F\")\n", " curl_command.append(f\"key={bucket_path}/{Path(subdir).relative_to(rootdir).joinpath(file)}\")\n", " curl_command.append(\"-F\")\n", " curl_command.append(f\"file=@{Path(subdir).joinpath( file)}\")\n", " \n", " subprocess.call(curl_command)" ], "id": "db7ae55d9e8fc517", 
"outputs": [], "execution_count": null } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.19" } }, "nbformat": 4, "nbformat_minor": 5 }