Load external data: Google Drive, Sheets, and Cloud Storage

Ying-Ting 2018/04/18

a. Load data from local file system

Uploading files from the local file system

files.upload returns a dictionary of the files that were uploaded. The dictionary is keyed by file name, and each value is the data that was uploaded.

from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))

Result: files.upload() uploads files into the Colab runtime environment.
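The values in the dictionary returned by files.upload() are raw bytes, so they usually need to be decoded before use. The following is a minimal sketch, assuming the uploaded file is a UTF-8 CSV with a header row. The helper name rows_from_upload and the sample file scores.csv are hypothetical, and the uploaded dictionary is simulated here so the snippet also runs outside Colab.

```python
import csv
import io


def rows_from_upload(uploaded, filename, encoding='utf-8'):
    """Parse one uploaded CSV file (bytes) into a list of dict rows."""
    text = uploaded[filename].decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Stand-in for the dict that files.upload() returns in Colab:
# {file name: file content as bytes}. (Assumed sample data.)
uploaded = {'scores.csv': b'name,score\nalice,90\nbob,85\n'}

rows = rows_from_upload(uploaded, 'scores.csv')
print(rows[0]['name'], rows[0]['score'])
```

In a real notebook, replace the simulated dictionary with uploaded = files.upload() and pass the name of the file you selected.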

Downloading files to the local file system

files.download triggers a browser download of the file to the user's local computer.

from google.colab import files

with open('example.txt', 'w') as f:
    f.write('some content')

files.download('example.txt')

Result: the code opens the file "example.txt", writes the line "some content" to it, and then downloads the file to your own computer.
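The same pattern works for data produced in the notebook, not only hand-written text. The following is a rough sketch, assuming you want to export a few rows as CSV. The helper write_rows and the file name results.csv are hypothetical, and the files.download call is guarded so the snippet still runs where google.colab is not installed.

```python
import csv
import os
import tempfile


def write_rows(path, header, rows):
    """Write a header and rows to a CSV file and return its size in bytes."""
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return os.path.getsize(path)


# Assumed sample data and output location.
path = os.path.join(tempfile.gettempdir(), 'results.csv')
size = write_rows(path, ['name', 'score'], [['alice', 90], ['bob', 85]])
print('Wrote {} bytes to {}'.format(size, path))

try:
    # Only available inside a Colab runtime.
    from google.colab import files
    files.download(path)
except ImportError:
    print('Not running in Colab; skipping the browser download.')
```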

b. Load data from Google Drive

Access files in Google Drive using the native REST API or a wrapper like PyDrive.

PyDrive

The example below shows (1) authentication, (2) file upload, and (3) file download from Google Drive. More examples are available in the PyDrive documentation.

!pip install -U -q PyDrive


from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials


# 1. Authenticate and create the PyDrive client.

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)


# PyDrive reference:
# https://googledrive.github.io/PyDrive/docs/build/html/index.html

# 2. Create & upload a text file.

uploaded = drive.CreateFile({'title': 'Sample upload.txt'})
uploaded.SetContentString('Sample upload file content')
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))


# 3. Load a file by ID and print its contents.

downloaded = drive.CreateFile({'id': uploaded.get('id')})
print('Downloaded content "{}"'.format(downloaded.GetContentString()))

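Result: during execution you choose which Google account to use, follow the prompts to obtain a verification code for that account, and paste the code back into the field below the cell.

Result: a new file, "Sample upload.txt", is created in Google Drive.

Result: the file "Sample upload.txt" is now in Google Drive.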
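When the file ID is not at hand, PyDrive can also look files up with a Drive query through ListFile. The following is a minimal sketch, assuming an authenticated GoogleDrive object like the drive variable created in step 1. The helper name find_by_title is hypothetical. PyDrive wraps the Drive v2 API, which is why the query field is title rather than name.

```python
def find_by_title(drive, title):
    """Return (title, id) pairs for non-trashed Drive files with this title.

    `drive` is assumed to be an authenticated pydrive.drive.GoogleDrive.
    """
    # Escape backslashes and single quotes for the Drive query string.
    safe = title.replace('\\', '\\\\').replace("'", "\\'")
    query = "title = '{}' and trashed = false".format(safe)
    file_list = drive.ListFile({'q': query}).GetList()
    return [(f['title'], f['id']) for f in file_list]


# In Colab, after authenticating as in step 1:
#   for title, file_id in find_by_title(drive, 'Sample upload.txt'):
#       print(title, file_id)
```

Running the upload example several times creates several files with the same title, so the helper returns a list rather than a single match.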

Drive REST API

Authentication is the first step

from google.colab import auth
auth.authenticate_user()

Then construct a Drive API client.

from googleapiclient.discovery import build
drive_service = build('drive', 'v3')

Once the client is created, you can use any of the functions in the Google Drive API reference.
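As a quick check that the client works, one of the simplest calls in the reference is files().list. The following is a small sketch, assuming drive_service was built as shown above. The helper name list_files and the page size of 10 are illustrative choices, not part of the original notebook.

```python
def list_files(drive_service, page_size=10):
    """Return (name, id) pairs for up to page_size files visible to the user.

    `drive_service` is assumed to be a Drive v3 client from build('drive', 'v3').
    """
    response = drive_service.files().list(
        pageSize=page_size,
        fields='files(id, name)').execute()
    return [(f['name'], f['id']) for f in response.get('files', [])]


# In Colab, after authenticating and building the client:
#   for name, file_id in list_files(drive_service):
#       print(name, file_id)
```

Restricting fields to the ID and name keeps the response small. The IDs printed here are the values that the download example expects in file_id.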

Create a new Google Drive file with data from Python

# Create a local file.
with open('/tmp/to_upload.txt', 'w') as f:
  f.write('my sample file')

# Print out the file content.
print('/tmp/to_upload.txt contains:')
!cat /tmp/to_upload.txt

Result:

After executing the code below, a new file named "Sample file" will appear in the drive.google.com file list.

from googleapiclient.discovery import build
drive_service = build('drive', 'v3')

# Upload the file to Drive. See:
#
# https://developers.google.com/drive/v3/reference/files/create
# https://developers.google.com/drive/v3/web/manage-uploads
from googleapiclient.http import MediaFileUpload

file_metadata = {
  'name': 'Sample file',
  'mimeType': 'text/plain'
}
media = MediaFileUpload('/tmp/to_upload.txt',
                        mimetype='text/plain',
                        resumable=True)
created = drive_service.files().create(body=file_metadata,
                                       media_body=media,
                                       fields='id').execute()
print('File ID: {}'.format(created.get('id')))

The file ID in your execution result will differ from the one in the example, because each run creates a new file.

Result:

Downloading data from a Google Drive file into Python

Before executing the code below, set file_id to the ID of the target file in Google Drive.

# Download the file we just uploaded.
#
# Replace the assignment below with your file ID
# to download a different file.
#
# A file ID looks like: 1uBtlaggVyWshwcyP6kEI-y_W3P8D26sz

file_id = 'target_file_id'

import io
from googleapiclient.http import MediaIoBaseDownload

request = drive_service.files().get_media(fileId=file_id)
downloaded = io.BytesIO()
downloader = MediaIoBaseDownload(downloaded, request)
done = False
while done is False:
  # _ is a placeholder for a progress object that we ignore.
  # (Our file is small, so we skip reporting progress.)
  _, done = downloader.next_chunk()

downloaded.seek(0)
print('Downloaded file contents are: {}'.format(downloaded.read()))

Result:
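The download loop above ignores the progress object and prints the contents as raw bytes. For larger text files it can help to report progress and decode the result. The following is a minimal sketch, assuming a MediaIoBaseDownload-style object whose next_chunk() returns a (status, done) pair and whose status exposes progress(). The helper name read_all_chunks is hypothetical, and UTF-8 is an assumed encoding.

```python
def read_all_chunks(downloader, buffer, encoding='utf-8'):
    """Drain a chunked download into `buffer` and return the decoded text.

    `downloader` is assumed to write into `buffer` (an io.BytesIO), as
    MediaIoBaseDownload(buffer, request) does.
    """
    done = False
    while not done:
        status, done = downloader.next_chunk()
        if status is not None:
            # progress() is a fraction between 0.0 and 1.0.
            print('Downloaded {:.0f}%'.format(status.progress() * 100))
    return buffer.getvalue().decode(encoding)


# In Colab, reusing the names from the example above:
#   downloaded = io.BytesIO()
#   downloader = MediaIoBaseDownload(downloaded, request)
#   text = read_all_chunks(downloader, downloaded)
```

Using getvalue() avoids the separate seek(0) step, since it returns the whole buffer regardless of the current position.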

c. Google Sheets

d. Google Cloud Storage (GCS)

Reference

[0] https://colab.research.google.com/notebook#fileId=/v2/external/notebooks/io.ipynb&scrollTo=KHeruhacFpSU
