CBOW-READ FILE
The following program demonstrates how to download a file from a URL.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import collections
import math
import os
import random
import zipfile
import numpy as np
from six.moves import urllib
from six.moves import xrange # pylint: disable=redefined-builtin
import tensorflow as tf
# Step 1: Download the data.
1.  url = 'http://mattmahoney.net/dc/'
    # Arguments: file name, expected file size in bytes
2.  def maybe_download(filename, expected_bytes):
        """Download a file if not present, and make sure it's the right size."""
3.      if not os.path.exists(filename):
4.          filename, _ = urllib.request.urlretrieve(url + filename, filename)
5.      statinfo = os.stat(filename)
6.      if statinfo.st_size == expected_bytes:
7.          print('Found and verified', filename)
8.      else:
9.          print(statinfo.st_size)
10.         raise Exception('Failed to verify ' + filename + '. Can you get to it with a browser?')
11.     return filename
12. filename = maybe_download('text8.zip', 31344016)
# Read the data into a list of strings.
13. def read_data(filename):
        """Extract the first file enclosed in a zip file as a list of words"""
14.     with zipfile.ZipFile(filename) as f:
15.         data = tf.compat.as_str(f.read(f.namelist()[0])).split()
16.     return data
17. words = read_data(filename)
18. print('Data size', len(words))
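Running the program prints the file name text8.zip and the data size (the number of words read, not the file size in bytes), 17005207: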
Found and verified text8.zip
Data size 17005207
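As the listing above shows, the code for reading a file depends on the file type, the file source, and the file path. This program demonstrates reading a ZIP file obtained from the web.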
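The ZIP-reading step can be tried without downloading anything. The following is a minimal sketch that uses only the standard library: it replaces tf.compat.as_str with bytes.decode, which assumes the archive holds UTF-8 text, and it builds a tiny in-memory archive as a hypothetical stand-in for text8.zip. The names read_words_from_zip and sample.txt are illustrative and do not come from the original program.

```python
import io
import zipfile


def read_words_from_zip(zip_source):
    """Return the first file inside a ZIP archive as a list of words."""
    with zipfile.ZipFile(zip_source) as archive:
        first_name = archive.namelist()[0]
        raw_bytes = archive.read(first_name)
    # Decoding as UTF-8 is an assumption about the archive contents.
    return raw_bytes.decode('utf-8').split()


# Build a tiny in-memory archive (hypothetical stand-in for text8.zip).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('sample.txt', 'the quick brown fox')
buffer.seek(0)

sample_words = read_words_from_zip(buffer)
print('Data size', len(sample_words))
```

zipfile.ZipFile accepts either a path or a file-like object, so the same function should also work when given the path of a downloaded archive.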
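import os: the os module is one of Python's basic modules. It provides many APIs for interacting with the operating system, for example reading the operating system's environment variables.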
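A short sketch of the os calls mentioned here and used in the download step. The HOME variable is only an example and may be unset on some systems, which is why a default value is supplied.

```python
import os

# Read an environment variable; .get() returns the default when it is unset.
home_dir = os.environ.get('HOME', '(not set)')
print('HOME =', home_dir)

# Calls that are useful before opening or downloading a file.
current_dir = os.getcwd()
print('Current directory:', current_dir)
print('text8.zip exists here:', os.path.exists('text8.zip'))
```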
import urllib: urllib can access web pages, download data, parse data, modify request headers, and issue GET and POST requests.
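These capabilities can be sketched without touching the network. The example below assumes Python 3, where the program's six.moves.urllib resolves to the standard urllib package. The User-Agent value and the form field are hypothetical, and nothing is sent until urlopen or urlretrieve is called.

```python
from urllib import parse, request

base_url = 'http://mattmahoney.net/dc/'
file_url = parse.urljoin(base_url, 'text8.zip')

# Parse the URL into its components.
parts = parse.urlparse(file_url)
print(parts.scheme, parts.netloc, parts.path)

# Build a GET request with a custom header (the value is illustrative).
get_request = request.Request(file_url, headers={'User-Agent': 'example-client/0.1'})
print(get_request.get_method())

# Supplying data turns the request into a POST (the form field is hypothetical).
form_data = parse.urlencode({'q': 'word2vec'}).encode('ascii')
post_request = request.Request(file_url, data=form_data)
print(post_request.get_method())
```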
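Line 10: raise throws an exception, here to signal that the file is wrong because its size does not match expected_bytes. It is used as follows.
Example: if the input is not an integer, raise a ValueError.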
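inputValue = input("Please input an integer: ")
# input() always returns a string in Python 3, so check its content, not its type.
if not inputValue.strip().lstrip('+-').isdigit():
    raise ValueError('not an integer: ' + inputValue)
else:
    print(inputValue)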
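The same pattern applies to the size check on line 10. The sketch below is an illustration rather than the original function: verify_size is a hypothetical helper, it raises ValueError instead of the generic Exception used in the program, and a temporary 5-byte file stands in for the download.

```python
import os
import tempfile


def verify_size(path, expected_bytes):
    """Return path if its size matches expected_bytes, otherwise raise."""
    actual_bytes = os.stat(path).st_size
    if actual_bytes != expected_bytes:
        raise ValueError('%s has %d bytes, expected %d'
                         % (path, actual_bytes, expected_bytes))
    return path


# Create a 5-byte temporary file to check against.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello')
    tmp_path = tmp.name

verified_path = verify_size(tmp_path, 5)   # size matches, path is returned

caught = None
try:
    verify_size(tmp_path, 10)              # size differs, ValueError is raised
except ValueError as err:
    caught = str(err)
    print('Caught:', caught)
finally:
    os.remove(tmp_path)
```

Catching the exception with try/except lets the caller decide whether to retry the download or stop.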
Python can read a file as follows. readlines() reads line by line until the end of the file and returns all lines as a list.
with open('filename.txt', 'r') as fp:
    all_lines = fp.readlines()
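When the file is too large to hold in memory as one list, iterating over the file object processes one line at a time. A minimal sketch, using a temporary file as a hypothetical stand-in for filename.txt:

```python
import os
import tempfile

# Write a small sample file (hypothetical stand-in for filename.txt).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\nthird line\n')
    sample_path = tmp.name

line_count = 0
stripped_lines = []
with open(sample_path, 'r') as fp:
    for line in fp:  # the file object yields one line per iteration
        stripped_lines.append(line.rstrip('\n'))
        line_count += 1

os.remove(sample_path)
print(line_count, stripped_lines)
```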
Line 17: inspect words to see the result of reading the data.
[1] File paths: https://www.itread01.com/content/1503829342.html
[2] Reading .csv files in Colab: https://ithelp.ithome.com.tw/questions/10189177
[3] Four ways to read a file in Python: https://www.phpini.com/perl/4-ways-write-file-line-by-line-in-python