Loading data into a table
In this example, we load data about TV series from a JSON file into a previously created table. Each series has an identifier (series_id), a title (title), and additional information (info). The JSON file with series data has the following structure:
[{
"series_id": ...,
"title": ...,
"info": {
...
}
},
...
]
The series_id and title values together form the primary key of the Series table.
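Because the key is composite, several records may share a series_id as long as their titles differ. In DynamoDB-compatible APIs, writing an item whose primary key already exists replaces the earlier item, so it can be worth checking the input for repeated pairs before loading. A minimal sketch, assuming records shaped like the structure above (the helper name find_duplicate_keys and the sample records are illustrative and not part of the tutorial's code):

```python
from collections import Counter


def find_duplicate_keys(records):
    """Return the (series_id, title) pairs that occur more than once."""
    counts = Counter((r["series_id"], r["title"]) for r in records)
    return [key for key, n in counts.items() if n > 1]


# Illustrative records: the same series_id with different titles is fine.
records = [
    {"series_id": 3, "title": "House of Cards", "info": {}},
    {"series_id": 3, "title": "The Office", "info": {}},
    {"series_id": 4, "title": "The Big Bang Theory", "info": {}},
]

print(find_duplicate_keys(records))  # prints []
```

An empty list means every record maps to a distinct primary key.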
To load data into the Series table:
Java
- Create the SeriesLoadData project:
    mvn -B archetype:generate \
      -DarchetypeGroupId=org.apache.maven.archetypes \
      -DgroupId=com.mycompany.app \
      -DartifactId=SeriesLoadData
  The command creates a project directory named SeriesLoadData in the current working directory, with a subdirectory structure and the pom.xml project description file.
- Go to the project directory:
    cd SeriesLoadData
- Edit the project description in the pom.xml file, for example with the nano editor:
    nano pom.xml
  Example pom.xml file:
    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.mycompany.app</groupId>
      <artifactId>SeriesLoadData</artifactId>
      <packaging>jar</packaging>
      <version>1.0-SNAPSHOT</version>
      <name>SeriesLoadData</name>
      <url>http://maven.apache.org</url>
      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-jar-plugin</artifactId>
            <configuration>
              <archive>
                <manifest>
                  <addClasspath>true</addClasspath>
                  <classpathPrefix>lib/</classpathPrefix>
                  <mainClass>com.mycompany.app.SeriesLoadData</mainClass>
                </manifest>
                <manifestEntries>
                  <Class-Path>.</Class-Path>
                </manifestEntries>
              </archive>
              <finalName>release/SeriesLoadData</finalName>
            </configuration>
          </plugin>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-dependency-plugin</artifactId>
            <executions>
              <execution>
                <id>copy-dependencies</id>
                <phase>prepare-package</phase>
                <goals>
                  <goal>copy-dependencies</goal>
                </goals>
                <configuration>
                  <outputDirectory>${project.build.directory}/release/lib</outputDirectory>
                  <overWriteReleases>false</overWriteReleases>
                  <overWriteSnapshots>false</overWriteSnapshots>
                  <overWriteIfNewer>true</overWriteIfNewer>
                </configuration>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
      <dependencies>
        <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>3.8.1</version>
          <scope>test</scope>
        </dependency>
        <dependency>
          <groupId>com.amazonaws</groupId>
          <artifactId>aws-java-sdk</artifactId>
          <version>1.11.1012</version>
        </dependency>
      </dependencies>
      <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
      </properties>
    </project>
  Check the current versions of junit and aws-java-sdk-dynamodb. The example pom.xml depends on the aggregate aws-java-sdk artifact, which includes the DynamoDB module.
- In the src/main/java/com/mycompany/app/ directory, create the SeriesLoadData.java file, for example with the nano editor:
    nano src/main/java/com/mycompany/app/SeriesLoadData.java
  Copy the following code into the file:
  Important
  Replace <Document_API_эндпоинт> with the value you prepared earlier.
    package com.mycompany.app;

    import java.io.File;
    import java.util.Iterator;

    import com.amazonaws.client.builder.AwsClientBuilder;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
    import com.amazonaws.services.dynamodbv2.document.DynamoDB;
    import com.amazonaws.services.dynamodbv2.document.Item;
    import com.amazonaws.services.dynamodbv2.document.Table;
    import com.fasterxml.jackson.core.JsonFactory;
    import com.fasterxml.jackson.core.JsonParser;
    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.node.ObjectNode;

    public class SeriesLoadData {

        public static void main(String[] args) throws Exception {

            // Client bound to the Document API endpoint
            AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
                .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration("<Document_API_эндпоинт>", "ru-central1"))
                .build();

            DynamoDB dynamoDB = new DynamoDB(client);
            Table table = dynamoDB.getTable("Series");

            // Read the whole JSON array from the data file
            JsonParser parser = new JsonFactory().createParser(new File("seriesdata.json"));
            JsonNode rootNode = new ObjectMapper().readTree(parser);
            Iterator<JsonNode> iter = rootNode.iterator();

            ObjectNode currentNode;

            while (iter.hasNext()) {
                currentNode = (ObjectNode) iter.next();

                int series_id = currentNode.path("series_id").asInt();
                String title = currentNode.path("title").asText();

                try {
                    // series_id and title form the primary key; info is stored as a JSON document
                    table.putItem(new Item()
                        .withPrimaryKey("series_id", series_id, "title", title)
                        .withJSON("info", currentNode.path("info").toString()));
                    System.out.println("Добавлен сериал: " + series_id + " " + title);
                } catch (Exception e) {
                    System.err.println("Невозможно загрузить данные: " + series_id + " " + title);
                    System.err.println(e.getMessage());
                    break; // stop at the first failed write
                }
            }
            parser.close();
        }
    }
  The code uses the open-source Jackson library to process JSON. Jackson comes with the AWS SDK for Java as one of its dependencies.
- Build the project:
    mvn package
  The command generates the SeriesLoadData.jar file in the target/release/ directory.
- Create the seriesdata.json file with the data to load, for example with the nano editor:
    nano seriesdata.json
  Copy the following data into the file:
    [
      {
        "series_id": 1,
        "title": "IT Crowd",
        "info": {
          "release_date": "2006-02-03T00:00:00Z",
          "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
        }
      },
      {
        "series_id": 2,
        "title": "Silicon Valley",
        "info": {
          "release_date": "2014-04-06T00:00:00Z",
          "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
        }
      },
      {
        "series_id": 3,
        "title": "House of Cards",
        "info": {
          "release_date": "2013-02-01T00:00:00Z",
          "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
        }
      },
      {
        "series_id": 3,
        "title": "The Office",
        "info": {
          "release_date": "2005-03-24T00:00:00Z",
          "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
        }
      },
      {
        "series_id": 3,
        "title": "True Detective",
        "info": {
          "release_date": "2014-01-12T00:00:00Z",
          "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
        }
      },
      {
        "series_id": 4,
        "title": "The Big Bang Theory",
        "info": {
          "release_date": "2007-09-24T00:00:00Z",
          "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
        }
      },
      {
        "series_id": 5,
        "title": "Twin Peaks",
        "info": {
          "release_date": "1990-04-08T00:00:00Z",
          "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
        }
      }
    ]
- Run the application:
    java -jar target/release/SeriesLoadData.jar
  Result:
    Добавлен сериал: 1 IT Crowd
    Добавлен сериал: 2 Silicon Valley
    Добавлен сериал: 3 House of Cards
    Добавлен сериал: 3 The Office
    Добавлен сериал: 3 True Detective
    Добавлен сериал: 4 The Big Bang Theory
    Добавлен сериал: 5 Twin Peaks
Python
- Create the SeriesLoadData.py file, for example with the nano editor:
    nano SeriesLoadData.py
  Copy the following code into the file:
  Important
  Replace <Document_API_эндпоинт> with the value you prepared earlier.
    from decimal import Decimal
    import json
    import boto3

    def load_series(series):
        # Resource bound to the Document API endpoint
        ydb_docapi_client = boto3.resource('dynamodb', endpoint_url = "<Document_API_эндпоинт>")

        table = ydb_docapi_client.Table('Series')
        for serie in series:
            series_id = int(serie['series_id'])
            title = serie['title']
            # Write first, then report, so the message is printed only after a successful write
            table.put_item(Item = serie)
            print("Добавлен сериал:", series_id, title)

    if __name__ == '__main__':
        # Parse fractional numbers as Decimal rather than float
        with open("seriesdata.json") as json_file:
            serie_list = json.load(json_file, parse_float = Decimal)
        load_series(serie_list)
- Create the seriesdata.json file with the data to load, for example with the nano editor:
    nano seriesdata.json
  Copy the following data into the file:
    [
      {
        "series_id": 1,
        "title": "IT Crowd",
        "info": {
          "release_date": "2006-02-03T00:00:00Z",
          "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
        }
      },
      {
        "series_id": 2,
        "title": "Silicon Valley",
        "info": {
          "release_date": "2014-04-06T00:00:00Z",
          "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
        }
      },
      {
        "series_id": 3,
        "title": "House of Cards",
        "info": {
          "release_date": "2013-02-01T00:00:00Z",
          "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
        }
      },
      {
        "series_id": 3,
        "title": "The Office",
        "info": {
          "release_date": "2005-03-24T00:00:00Z",
          "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
        }
      },
      {
        "series_id": 3,
        "title": "True Detective",
        "info": {
          "release_date": "2014-01-12T00:00:00Z",
          "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
        }
      },
      {
        "series_id": 4,
        "title": "The Big Bang Theory",
        "info": {
          "release_date": "2007-09-24T00:00:00Z",
          "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
        }
      },
      {
        "series_id": 5,
        "title": "Twin Peaks",
        "info": {
          "release_date": "1990-04-08T00:00:00Z",
          "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
        }
      }
    ]
- Run the program:
    python SeriesLoadData.py
  Result:
    Добавлен сериал: 1 IT Crowd
    Добавлен сериал: 2 Silicon Valley
    Добавлен сериал: 3 House of Cards
    Добавлен сериал: 3 The Office
    Добавлен сериал: 3 True Detective
    Добавлен сериал: 4 The Big Bang Theory
    Добавлен сериал: 5 Twin Peaks
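The parse_float = Decimal argument in the Python script above is there because boto3 does not accept Python float values in items and expects decimal.Decimal instead. The tutorial's data set contains no fractional numbers, so the sketch below uses a hypothetical rating field (an assumption made only for this illustration) to show what the argument changes:

```python
import json
from decimal import Decimal

# Hypothetical record: "rating" is not part of the tutorial's data set.
raw = '{"series_id": 1, "title": "IT Crowd", "info": {"rating": 8.5}}'

plain = json.loads(raw)                       # default: fractions become float
exact = json.loads(raw, parse_float=Decimal)  # fractions become Decimal

print(type(plain["info"]["rating"]).__name__)  # float
print(type(exact["info"]["rating"]).__name__)  # Decimal
print(type(exact["series_id"]).__name__)       # int (parse_float only affects fractions)
```

Integers and strings are unaffected, so passing the argument is a cheap safeguard for data files that may later gain fractional fields.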
PHP
- Create the SeriesLoadData.php file, for example with the nano editor:
    nano SeriesLoadData.php
  Copy the following code into the file:
  Important
  Replace <Document_API_эндпоинт> with the value you prepared earlier.
    <?php

    require 'vendor/autoload.php';

    date_default_timezone_set('UTC');

    use Aws\DynamoDb\Exception\DynamoDbException;
    use Aws\DynamoDb\Marshaler;

    $sdk = new Aws\Sdk([
        'endpoint' => '<Document_API_эндпоинт>',
        'region'   => 'ru-central1',
        'version'  => 'latest'
    ]);

    $dynamodb = $sdk->createDynamoDb();
    $marshaler = new Marshaler();

    $tableName = 'Series';

    // Decode the data file into an array of associative arrays
    $Series = json_decode(file_get_contents('seriesdata.json'), true);

    foreach ($Series as $movie) {

        $series_id = $movie['series_id'];
        $title = $movie['title'];
        $info = $movie['info'];

        $json = json_encode([
            'series_id' => $series_id,
            'title' => $title,
            'info' => $info
        ]);

        $params = [
            'TableName' => $tableName,
            'Item' => $marshaler->marshalJson($json)
        ];

        try {
            $result = $dynamodb->putItem($params);
            echo "Добавлен сериал: " . $movie['series_id'] . " " . $movie['title'] . "\n";
        } catch (DynamoDbException $e) {
            echo "Невозможно загрузить данные:\n";
            echo $e->getMessage() . "\n";
            break; // stop at the first failed write
        }

    }

    ?>
  The Marshaler class provides methods for converting JSON documents and PHP arrays to the YDB format. In this code, $marshaler->marshalJson($json) takes data in JSON format and converts it into a YDB record.
- Create the seriesdata.json file with the data to load, for example with the nano editor:
    nano seriesdata.json
  Copy the following data into the file:
    [
      {
        "series_id": 1,
        "title": "IT Crowd",
        "info": {
          "release_date": "2006-02-03T00:00:00Z",
          "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
        }
      },
      {
        "series_id": 2,
        "title": "Silicon Valley",
        "info": {
          "release_date": "2014-04-06T00:00:00Z",
          "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
        }
      },
      {
        "series_id": 3,
        "title": "House of Cards",
        "info": {
          "release_date": "2013-02-01T00:00:00Z",
          "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
        }
      },
      {
        "series_id": 3,
        "title": "The Office",
        "info": {
          "release_date": "2005-03-24T00:00:00Z",
          "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
        }
      },
      {
        "series_id": 3,
        "title": "True Detective",
        "info": {
          "release_date": "2014-01-12T00:00:00Z",
          "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
        }
      },
      {
        "series_id": 4,
        "title": "The Big Bang Theory",
        "info": {
          "release_date": "2007-09-24T00:00:00Z",
          "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
        }
      },
      {
        "series_id": 5,
        "title": "Twin Peaks",
        "info": {
          "release_date": "1990-04-08T00:00:00Z",
          "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
        }
      }
    ]
- Run the program:
    php SeriesLoadData.php
  Result:
    Добавлен сериал: 1 IT Crowd
    Добавлен сериал: 2 Silicon Valley
    Добавлен сериал: 3 House of Cards
    Добавлен сериал: 3 The Office
    Добавлен сериал: 3 True Detective
    Добавлен сериал: 4 The Big Bang Theory
    Добавлен сериал: 5 Twin Peaks
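The Marshaler step in the PHP code above turns a plain JSON document into typed attribute values. As a rough illustration in Python, the sketch below hand-writes the DynamoDB-style attribute-value layout. It is not the SDK's actual implementation, and the function names marshal and marshal_item are assumptions made for this example:

```python
def marshal(value):
    """Convert a JSON-like Python value to a typed attribute value."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # must precede the int check: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are carried as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [marshal(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: marshal(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def marshal_item(item):
    """Marshal a top-level item: each attribute gets its own typed wrapper."""
    return {name: marshal(v) for name, v in item.items()}


item = {
    "series_id": 1,
    "title": "IT Crowd",
    "info": {"release_date": "2006-02-03T00:00:00Z"},
}
print(marshal_item(item))
```

For the sample item, series_id becomes {"N": "1"}, title becomes {"S": "IT Crowd"}, and the nested info object becomes a map ({"M": ...}) of typed values.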
Node.js
- Create the SeriesLoadData.js file, for example with the nano editor:
    nano SeriesLoadData.js
  Copy the following code into the file:
  Important
  Replace <Document_API_эндпоинт> with the value you prepared earlier.
    const AWS = require("@aws-sdk/client-dynamodb");
    // marshall comes from the separate @aws-sdk/util-dynamodb package, which must be installed
    const { marshall } = require("@aws-sdk/util-dynamodb");
    const fs = require("fs");

    // Credentials should be defined via environment variables AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID
    const dynamodb = new AWS.DynamoDBClient({
        region: "ru-central1",
        endpoint: "<Document_API_эндпоинт>",
    });

    console.log("Загрузка сериалов в YDB. Пожалуйста, подождите...");

    const allSeries = JSON.parse(fs.readFileSync('seriesdata.json', 'utf8'));

    allSeries.forEach(function(series) {
        dynamodb.send(new AWS.PutItemCommand({
            TableName: "Series",
            // PutItemCommand expects typed attribute values, so convert the plain object first
            Item: marshall({
                "series_id": series.series_id,
                "title": series.title,
                "info": series.info
            })
        }))
            .then(() => {
                console.log("Добавлен сериал:", series.title);
            })
            .catch(err => {
                console.error("Невозможно добавить сериал", series.title, ". Error JSON:", JSON.stringify(err, null, 2));
            })
    });
- Create the seriesdata.json file with the data to load, for example with the nano editor:
    nano seriesdata.json
  Copy the following data into the file:
    [
      {
        "series_id": 1,
        "title": "IT Crowd",
        "info": {
          "release_date": "2006-02-03T00:00:00Z",
          "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
        }
      },
      {
        "series_id": 2,
        "title": "Silicon Valley",
        "info": {
          "release_date": "2014-04-06T00:00:00Z",
          "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
        }
      },
      {
        "series_id": 3,
        "title": "House of Cards",
        "info": {
          "release_date": "2013-02-01T00:00:00Z",
          "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
        }
      },
      {
        "series_id": 3,
        "title": "The Office",
        "info": {
          "release_date": "2005-03-24T00:00:00Z",
          "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
        }
      },
      {
        "series_id": 3,
        "title": "True Detective",
        "info": {
          "release_date": "2014-01-12T00:00:00Z",
          "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
        }
      },
      {
        "series_id": 4,
        "title": "The Big Bang Theory",
        "info": {
          "release_date": "2007-09-24T00:00:00Z",
          "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
        }
      },
      {
        "series_id": 5,
        "title": "Twin Peaks",
        "info": {
          "release_date": "1990-04-08T00:00:00Z",
          "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
        }
      }
    ]
- Run the program:
    node SeriesLoadData.js
  Result (the writes are sent concurrently, so the order of the lines may differ between runs):
    Загрузка сериалов в YDB. Пожалуйста, подождите...
    Добавлен сериал: The Office
    Добавлен сериал: IT Crowd
    Добавлен сериал: House of Cards
    Добавлен сериал: The Big Bang Theory
    Добавлен сериал: Twin Peaks
    Добавлен сериал: Silicon Valley
    Добавлен сериал: True Detective
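The titles in the Node.js output above appear out of file order because the script starts every write without waiting for the previous one to finish. The sketch below imitates that behavior in Python with a thread pool. Here fake_put is a hypothetical stand-in for the network call, and the random delays are an assumption used only to make the effect visible:

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

titles = ["IT Crowd", "Silicon Valley", "House of Cards", "The Office"]


def fake_put(title):
    """Stand-in for a network write: sleeps a short random time, then returns."""
    time.sleep(random.uniform(0.01, 0.05))
    return title


# Submit all writes at once and collect them in completion order.
with ThreadPoolExecutor(max_workers=len(titles)) as pool:
    futures = [pool.submit(fake_put, t) for t in titles]
    completed = [f.result() for f in as_completed(futures)]

print(completed)  # same titles as the input, but the order varies between runs
```

If the order of the log lines matters, the writes would need to be awaited one at a time instead, at the cost of a longer total load time.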
Ruby
- Create the SeriesLoadData.rb file, for example with the nano editor:
    nano SeriesLoadData.rb
  Copy the following code into the file:
  Important
  Replace <Document_API_эндпоинт> with the value you prepared earlier.
    require 'aws-sdk-dynamodb'
    require 'json'

    $series_counter = 0
    $total_series = 0

    # Writes one item and reports progress; stops the program on the first error
    def add_item_to_table(dynamodb_client, table_item)
      dynamodb_client.put_item(table_item)
      $series_counter += 1
      puts "Загрузка сериала #{$series_counter}/#{$total_series}: " \
        "'#{table_item[:item]['title']} " \
        "(#{table_item[:item]['series_id']})'."
    rescue StandardError => e
      puts "Ошибка загрузки сериала '#{table_item[:item]['title']} " \
        "(#{table_item[:item]['series_id']})': #{e.message}"
      puts 'Программа остановлена.'
      exit 1
    end

    def run_me
      region = 'ru-central1'
      table_name = 'Series'
      data_file = 'seriesdata.json'

      Aws.config.update(
        endpoint: '<Document_API_эндпоинт>',
        region: region
      )

      dynamodb_client = Aws::DynamoDB::Client.new
      file = File.read(data_file)
      series = JSON.parse(file)
      $total_series = series.count

      puts "Будет загружено #{$total_series} сериалов из файла '#{data_file}' " \
        "в таблицу '#{table_name}'..."

      series.each do |seria|
        table_item = {
          table_name: table_name,
          item: seria
        }
        add_item_to_table(dynamodb_client, table_item)
      end

      puts 'Загрузка успешно выполнена.'
    end

    run_me if $PROGRAM_NAME == __FILE__
- Create the seriesdata.json file with the data to load, for example with the nano editor:
    nano seriesdata.json
  Copy the following data into the file:
    [
      {
        "series_id": 1,
        "title": "IT Crowd",
        "info": {
          "release_date": "2006-02-03T00:00:00Z",
          "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
        }
      },
      {
        "series_id": 2,
        "title": "Silicon Valley",
        "info": {
          "release_date": "2014-04-06T00:00:00Z",
          "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
        }
      },
      {
        "series_id": 3,
        "title": "House of Cards",
        "info": {
          "release_date": "2013-02-01T00:00:00Z",
          "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
        }
      },
      {
        "series_id": 3,
        "title": "The Office",
        "info": {
          "release_date": "2005-03-24T00:00:00Z",
          "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
        }
      },
      {
        "series_id": 3,
        "title": "True Detective",
        "info": {
          "release_date": "2014-01-12T00:00:00Z",
          "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
        }
      },
      {
        "series_id": 4,
        "title": "The Big Bang Theory",
        "info": {
          "release_date": "2007-09-24T00:00:00Z",
          "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
        }
      },
      {
        "series_id": 5,
        "title": "Twin Peaks",
        "info": {
          "release_date": "1990-04-08T00:00:00Z",
          "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
        }
      }
    ]
- Run the program:
    ruby SeriesLoadData.rb
  Result:
    Будет загружено 7 сериалов из файла 'seriesdata.json' в таблицу 'Series'...
    Загрузка сериала 1/7: 'IT Crowd (1)'.
    Загрузка сериала 2/7: 'Silicon Valley (2)'.
    Загрузка сериала 3/7: 'House of Cards (3)'.
    Загрузка сериала 4/7: 'The Office (3)'.
    Загрузка сериала 5/7: 'True Detective (3)'.
    Загрузка сериала 6/7: 'The Big Bang Theory (4)'.
    Загрузка сериала 7/7: 'Twin Peaks (5)'.
    Загрузка успешно выполнена.