JSON
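JSON [1] is short for JavaScript Object Notation, and its file extension is .json. It is a human-readable, text-based format for transmitting data objects made up of attribute-value pairs and ordered sequences of values. Although JSON syntax is based on a subset of JavaScript, JSON is a language-independent text format, and most programming languages today can generate and parse JSON data.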
Data structures
JSON describes data using two kinds of structure:
- object
- array
 
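Object format
An object is an unordered collection of name/value pairs. It begins with { and ends with }. Within each pair the name and the value are separated by :, and successive pairs are separated by ,. The general form is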
{name:value}
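The name is a string. The value can be a string, a number, an object, a boolean, an ordered list (array), or null.
- String: a sequence of characters enclosed in " ".
- Number: a sequence of the digits 0-9, optionally negative or with a fractional part; it can also be written in exponent form using e or E.
- Boolean: true or false.
- null: a special value that represents an empty value.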
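To make the value types listed above concrete, here is a minimal sketch that parses a small object with Python's standard `json` module and reports which Python type each value becomes. The sample text and its member names (`title`, `count`, and so on) are made up for illustration and do not come from any dataset file.

```python
import json

# Hypothetical sample object with one member per JSON value type.
text = (
    '{"title": "example", "count": -3, "ratio": 1.5e2, "ok": true, '
    '"tags": ["a", "b"], "extra": null, "nested": {"k": "v"}}'
)

data = json.loads(text)

# Map each member name to the name of the Python type the parser produced.
types = {name: type(value).__name__ for name, value in data.items()}

for name, type_name in types.items():
    print(f"{name}: {type_name}")
```

With the standard parser, objects become `dict`, arrays become `list`, `true`/`false` become `bool`, and `null` becomes `None`; a number written with a fraction or an exponent is read as a `float`.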
Array format
An array is an ordered collection of values. It begins with [ and ends with ], and its members are separated by ,. The general form is
[value, ..., value]
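As a small illustration of the array form, the sketch below parses an array whose members have mixed types and then serializes it back to text. The sample values are invented for the example.

```python
import json

# Hypothetical array mixing several value types, including a nested array.
values = json.loads('[1, "two", [3, 4], {"five": 5}, null]')

# Members keep their order, so they can be addressed by position.
first = values[0]
nested_second = values[2][1]

# Serializing the parsed list produces JSON text again.
encoded = json.dumps(values)
print(encoded)
```

Because an array is ordered, position carries meaning, unlike the members of an object, which are looked up by name.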
qa.json
- qid: A unique id for every question; it also indicates the train, val, or test set.
- imdb_key: The movie this question belongs to.
- question: The question string.
- answers: The five answer options.
- correct_index: Index of the correct answer option.
- plot_alignment: Line numbers in the split_plot file to which this question corresponds.
- video_clips: Clips that are aligned with the question, to be used for answering.
Example JSON content:
[
    {
        "qid": "train:53",
        "question": "Why is Cameron upset with Bianca after the party?",
        "answers": [
            "He is upset that she likes Patrick instead of him",
            "He doesn't like the way she has been treating him since she started seeing Patrick",
            "He doesn't like the way she has been treating him since she started seeing Joey",
            "He doesn't like the way she has been treating Kat",
            "He is upset that she was drinking at the party"
        ],
        "imdb_key": "tt0147800",
        "correct_index": 2,
        "plot_alignment": [
            17,
            19
        ],
        "video_clips": [
            "tt0147800.sf-064241.ef-065690.video.mp4",
            "tt0147800.sf-065691.ef-065865.video.mp4",
            "tt0147800.sf-076586.ef-076829.video.mp4",
            "tt0147800.sf-076830.ef-078533.video.mp4",
            "tt0147800.sf-078534.ef-078607.video.mp4"
        ]
    }
]
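The following sketch shows one way such an entry might be read in Python. It embeds a copy of the sample entry (with only the first video clip) instead of opening qa.json, and it assumes that `correct_index` is zero-based and that the text before the `:` in `qid` names the set; neither convention is stated explicitly here, so treat both as assumptions. The helper `describe` is hypothetical.

```python
import json

# Copy of the sample entry, shortened to a single video clip.
QA_JSON = """
[
  {
    "qid": "train:53",
    "question": "Why is Cameron upset with Bianca after the party?",
    "answers": [
      "He is upset that she likes Patrick instead of him",
      "He doesn't like the way she has been treating him since she started seeing Patrick",
      "He doesn't like the way she has been treating him since she started seeing Joey",
      "He doesn't like the way she has been treating Kat",
      "He is upset that she was drinking at the party"
    ],
    "imdb_key": "tt0147800",
    "correct_index": 2,
    "plot_alignment": [17, 19],
    "video_clips": ["tt0147800.sf-064241.ef-065690.video.mp4"]
  }
]
"""


def describe(entry):
    """Summarize one question entry (hypothetical helper)."""
    # ASSUMPTION: the part of qid before ':' is the set name (train/val/test).
    split, _, _ = entry["qid"].partition(":")
    # ASSUMPTION: correct_index counts from 0.
    answer = entry["answers"][entry["correct_index"]]
    return {
        "split": split,
        "movie": entry["imdb_key"],
        "answer": answer,
        "num_clips": len(entry["video_clips"]),
    }


entries = json.loads(QA_JSON)
summary = describe(entries[0])
print(summary)
```

If `correct_index` turned out to be one-based, the lookup would need `entry["correct_index"] - 1` instead.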
movies.json
- imdb_key: A unique id for every movie, which corresponds to its IMDb number.
- name: Movie title.
- year: Release year.
- genre: Genres the movie belongs to.
- text: Text sources describing the movie (plot, subtitle, dvs, script).
Example JSON content:
[
  {
    "genre": "Comedy, Drama, Romance", 
    "text": {
      "plot": "story/plot/tt1022603.wiki", 
      "subtitle": "story/subtt/tt1022603.srt", 
      "dvs": null, 
      "script": "story/script/tt1022603.script"
    }, 
    "imdb_key": "tt1022603", 
    "name": "(500) Days of Summer", 
    "year": "2009"
  }
]
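A minimal sketch of working with this file in Python follows. It embeds a copy of the sample entry rather than opening movies.json, builds a lookup table keyed by `imdb_key`, and derives a few convenient values. The variable names are illustrative, and treating `genre` as a comma-separated list is an assumption based on the sample.

```python
import json

# Copy of the sample entry from movies.json.
MOVIES_JSON = """
[
  {
    "genre": "Comedy, Drama, Romance",
    "text": {
      "plot": "story/plot/tt1022603.wiki",
      "subtitle": "story/subtt/tt1022603.srt",
      "dvs": null,
      "script": "story/script/tt1022603.script"
    },
    "imdb_key": "tt1022603",
    "name": "(500) Days of Summer",
    "year": "2009"
  }
]
"""

# Index the movies by imdb_key for direct lookup.
movies = {movie["imdb_key"]: movie for movie in json.loads(MOVIES_JSON)}
movie = movies["tt1022603"]

# ASSUMPTION: genre is a comma-separated string, as in the sample.
genres = [genre.strip() for genre in movie["genre"].split(",")]

# year is stored as a string in the sample, so convert it before comparing.
year = int(movie["year"])

# Text sources whose path is null (here "dvs") are treated as unavailable.
available = sorted(kind for kind, path in movie["text"].items() if path is not None)
print(movie["name"], year, genres, available)
```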
splits.json
Lists the imdb_key values of the movies in the train, val, and test sets.
{
  "test": [
    "tt0113243", 
    "tt0110950", 
    "tt0151804", 
    "tt0179098", 
    ...
  ], 
  "train": [
    "tt0171433", 
    "tt1981115", 
    "tt0343660", 
    "tt1401152", 
    ...
  ], 
  "val": [
    "tt0822832", 
    "tt0087892", 
    "tt0363771", 
    "tt0034583", 
    ...
  ]
}
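One common use of such a file is to look up which set a given movie belongs to. The sketch below inverts the mapping in Python; it embeds only the first two keys of each list shown above instead of opening splits.json, and the names `split_of` and `overlapping` are illustrative.

```python
import json

# First two keys of each list from the sample; the real lists are longer.
SPLITS_JSON = """
{
  "test": ["tt0113243", "tt0110950"],
  "train": ["tt0171433", "tt1981115"],
  "val": ["tt0822832", "tt0087892"]
}
"""

splits = json.loads(SPLITS_JSON)

# Invert the mapping: imdb_key -> set name.
split_of = {key: name for name, keys in splits.items() for key in keys}

# Sanity check: collect any key that is listed in more than one set.
all_keys = [key for keys in splits.values() for key in keys]
overlapping = sorted({key for key in all_keys if all_keys.count(key) > 1})

print(split_of["tt0171433"])
print(overlapping)
```

An empty `overlapping` list means every movie in the embedded sample appears in exactly one set.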