Uploading data to a table
In this example, we upload information about TV series from a JSON file to a previously created table. Each series has an ID (series_id), a title (title), and additional information (info). The JSON file with series information has the following structure:
[{
"series_id": ...,
"title": ...,
"info": {
...
}
},
...
]
The series_id and title values are used as the primary key of the Series table.
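Because the primary key combines series_id and title, several records may share a series_id as long as their titles differ. In DynamoDB-compatible APIs, writing an item whose key already exists normally replaces the stored item, so it can be worth checking the file for repeated keys before uploading. A minimal Python sketch of such a check, assuming the file layout shown above; the helper name find_duplicate_keys and the sample records are hypothetical:

```python
from collections import Counter


def find_duplicate_keys(series):
    """Return the (series_id, title) pairs that occur more than once."""
    counts = Counter((item["series_id"], item["title"]) for item in series)
    return [key for key, count in counts.items() if count > 1]


# Hypothetical sample records in the same shape as the JSON structure above.
sample = [
    {"series_id": 3, "title": "House of Cards", "info": {}},
    {"series_id": 3, "title": "The Office", "info": {}},
    {"series_id": 3, "title": "The Office", "info": {}},
]

print(find_duplicate_keys(sample))  # [(3, 'The Office')]
```

The first two records share series_id 3 but are distinct keys; only the repeated pair is reported.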
To upload data to the Series table:
-
Create the SeriesLoadData project:
mvn -B archetype:generate \
  -DarchetypeGroupId=org.apache.maven.archetypes \
  -DgroupId=com.mycompany.app \
  -DartifactId=SeriesLoadData
As a result of running the command, the system will create the SeriesLoadData project folder in the current working folder, with a subfolder structure and the pom.xml project description file.
-
Go to the project folder:
cd SeriesLoadData
-
Edit the project description in the pom.xml file, for example, using the nano editor:
nano pom.xml
Sample pom.xml file:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.app</groupId>
  <artifactId>SeriesLoadData</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>SeriesLoadData</name>
  <url>http://maven.apache.org</url>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <addClasspath>true</addClasspath>
              <classpathPrefix>lib/</classpathPrefix>
              <mainClass>com.mycompany.app.SeriesLoadData</mainClass>
            </manifest>
            <manifestEntries>
              <Class-Path>.</Class-Path>
            </manifestEntries>
          </archive>
          <finalName>release/SeriesLoadData</finalName>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <executions>
          <execution>
            <id>copy-dependencies</id>
            <phase>prepare-package</phase>
            <goals>
              <goal>copy-dependencies</goal>
            </goals>
            <configuration>
              <outputDirectory>${project.build.directory}/release/lib</outputDirectory>
              <overWriteReleases>false</overWriteReleases>
              <overWriteSnapshots>false</overWriteSnapshots>
              <overWriteIfNewer>true</overWriteIfNewer>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.amazonaws</groupId>
      <artifactId>aws-java-sdk</artifactId>
      <version>1.11.1012</version>
    </dependency>
  </dependencies>
  <properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
  </properties>
</project>
Check the current versions of junit and aws-java-sdk-dynamodb.
-
In the src/main/java/com/mycompany/app/ folder, create the SeriesLoadData.java file, for example, using the nano editor:
nano src/main/java/com/mycompany/app/SeriesLoadData.java
Copy the following code to the created file:
Warning
For <Document_API_endpoint>, provide the prepared value.
package com.mycompany.app;

import java.io.File;
import java.util.Iterator;

import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class SeriesLoadData {

    public static void main(String[] args) throws Exception {

        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
            .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration("<Document_API_endpoint>", "ru-central1"))
            .build();

        DynamoDB dynamoDB = new DynamoDB(client);

        Table table = dynamoDB.getTable("Series");

        JsonParser parser = new JsonFactory().createParser(new File("seriesdata.json"));

        JsonNode rootNode = new ObjectMapper().readTree(parser);
        Iterator<JsonNode> iter = rootNode.iterator();

        ObjectNode currentNode;

        while (iter.hasNext()) {
            currentNode = (ObjectNode) iter.next();

            int series_id = currentNode.path("series_id").asInt();
            String title = currentNode.path("title").asText();

            try {
                table.putItem(new Item().withPrimaryKey("series_id", series_id, "title", title).withJSON("info",
                    currentNode.path("info").toString()));
                System.out.println("Series added: " + series_id + " " + title);
            }
            catch (Exception e) {
                System.err.println("Couldn't upload data: " + series_id + " " + title);
                System.err.println(e.getMessage());
                break;
            }
        }
        parser.close();
    }
}
The code uses Jackson, the open source JSON processing library. Jackson is included in the AWS SDK for Java.
-
Build the project:
mvn package
As a result of running the command, the SeriesLoadData.jar file will be generated in the target/release/ folder.
-
Create a file named seriesdata.json with the data to upload, for example, using the nano editor:
nano seriesdata.json
Copy the following code to the created file:
[
  {
    "series_id": 1,
    "title": "IT Crowd",
    "info": {
      "release_date": "2006-02-03T00:00:00Z",
      "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
    }
  },
  {
    "series_id": 2,
    "title": "Silicon Valley",
    "info": {
      "release_date": "2014-04-06T00:00:00Z",
      "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
    }
  },
  {
    "series_id": 3,
    "title": "House of Cards",
    "info": {
      "release_date": "2013-02-01T00:00:00Z",
      "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
    }
  },
  {
    "series_id": 3,
    "title": "The Office",
    "info": {
      "release_date": "2005-03-24T00:00:00Z",
      "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
    }
  },
  {
    "series_id": 3,
    "title": "True Detective",
    "info": {
      "release_date": "2014-01-12T00:00:00Z",
      "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
    }
  },
  {
    "series_id": 4,
    "title": "The Big Bang Theory",
    "info": {
      "release_date": "2007-09-24T00:00:00Z",
      "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
    }
  },
  {
    "series_id": 5,
    "title": "Twin Peaks",
    "info": {
      "release_date": "1990-04-08T00:00:00Z",
      "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
    }
  }
]
-
Run the application:
java -jar target/release/SeriesLoadData.jar
Result:
Series added: 1 IT Crowd
Series added: 2 Silicon Valley
Series added: 3 House of Cards
Series added: 3 The Office
Series added: 3 True Detective
Series added: 4 The Big Bang Theory
Series added: 5 Twin Peaks
-
Create the SeriesLoadData.py file, for example, using the nano editor:
nano SeriesLoadData.py
Copy the following code to the created file:
Warning
For <Document_API_endpoint>, provide the prepared value.
from decimal import Decimal
import json
import boto3

def load_series(series):
    ydb_docapi_client = boto3.resource('dynamodb', endpoint_url = "<Document_API_endpoint>")

    table = ydb_docapi_client.Table('Series')
    for serie in series:
        series_id = int(serie['series_id'])
        title = serie['title']
        print("Series added:", series_id, title)
        table.put_item(Item = serie)

if __name__ == '__main__':
    with open("seriesdata.json") as json_file:
        serie_list = json.load(json_file, parse_float = Decimal)
    load_series(serie_list)
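The script passes parse_float = Decimal to json.load because boto3 does not accept Python float values in items and expects Decimal instead. The sample file contains no fractional numbers, so the sketch below uses a hypothetical rating field, purely as an illustration, to show the effect of the parameter:

```python
import json
from decimal import Decimal

# Hypothetical record: the "rating" field is not part of the sample data.
raw = '{"series_id": 1, "title": "IT Crowd", "info": {"rating": 8.5}}'

default_parsed = json.loads(raw)
decimal_parsed = json.loads(raw, parse_float=Decimal)

# Without parse_float, fractional numbers become float values.
print(type(default_parsed["info"]["rating"]).__name__)  # float
# With parse_float=Decimal, they are parsed into exact Decimal values.
print(type(decimal_parsed["info"]["rating"]).__name__)  # Decimal
# Integers are unaffected and stay int in both cases.
print(type(decimal_parsed["series_id"]).__name__)  # int
```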
-
Create a file named seriesdata.json with the data to upload, for example, using the nano editor:
nano seriesdata.json
Copy the following code to the created file:
[
  {
    "series_id": 1,
    "title": "IT Crowd",
    "info": {
      "release_date": "2006-02-03T00:00:00Z",
      "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
    }
  },
  {
    "series_id": 2,
    "title": "Silicon Valley",
    "info": {
      "release_date": "2014-04-06T00:00:00Z",
      "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
    }
  },
  {
    "series_id": 3,
    "title": "House of Cards",
    "info": {
      "release_date": "2013-02-01T00:00:00Z",
      "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
    }
  },
  {
    "series_id": 3,
    "title": "The Office",
    "info": {
      "release_date": "2005-03-24T00:00:00Z",
      "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
    }
  },
  {
    "series_id": 3,
    "title": "True Detective",
    "info": {
      "release_date": "2014-01-12T00:00:00Z",
      "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
    }
  },
  {
    "series_id": 4,
    "title": "The Big Bang Theory",
    "info": {
      "release_date": "2007-09-24T00:00:00Z",
      "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
    }
  },
  {
    "series_id": 5,
    "title": "Twin Peaks",
    "info": {
      "release_date": "1990-04-08T00:00:00Z",
      "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
    }
  }
]
-
Run the program:
python SeriesLoadData.py
Result:
Series added: 1 IT Crowd
Series added: 2 Silicon Valley
Series added: 3 House of Cards
Series added: 3 The Office
Series added: 3 True Detective
Series added: 4 The Big Bang Theory
Series added: 5 Twin Peaks
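The Python script does not handle write errors, while the Java and PHP versions stop at the first record that fails. Below is a sketch of the same loop with comparable error handling. It assumes only that the table object exposes put_item(Item=...), as boto3 tables do; FakeTable is a hypothetical stand-in so the example runs without a database connection, and its validation rule is invented for illustration:

```python
def load_series(table, series):
    """Write records one by one; stop at the first failure.

    Returns the number of records written successfully.
    """
    added = 0
    for item in series:
        try:
            table.put_item(Item=item)
        except Exception as error:  # boto3 reports API errors as botocore ClientError
            print("Couldn't upload data:", item["series_id"], item["title"])
            print(error)
            break
        print("Series added:", item["series_id"], item["title"])
        added += 1
    return added


class FakeTable:
    """Hypothetical stand-in for a boto3 Table; rejects items with an empty title."""

    def __init__(self):
        self.items = []

    def put_item(self, Item):
        if not Item.get("title"):
            raise ValueError("missing key attribute: title")
        self.items.append(Item)


table = FakeTable()
records = [
    {"series_id": 1, "title": "IT Crowd"},
    {"series_id": 2, "title": ""},
    {"series_id": 3, "title": "House of Cards"},
]
print(load_series(table, records))  # 1
```

Here the confirmation is printed only after the write succeeds, so the output reflects what was actually stored.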
-
Create the SeriesLoadData.php file, for example, using the nano editor:
nano SeriesLoadData.php
Copy the following code to the created file:
Warning
For <Document_API_endpoint>, provide the prepared value.
<?php

require 'vendor/autoload.php';

date_default_timezone_set('UTC');

use Aws\DynamoDb\Exception\DynamoDbException;
use Aws\DynamoDb\Marshaler;

$sdk = new Aws\Sdk([
    'endpoint' => '<Document_API_endpoint>',
    'region'   => 'ru-central1',
    'version'  => 'latest'
]);

$dynamodb = $sdk->createDynamoDb();
$marshaler = new Marshaler();

$tableName = 'Series';

$Series = json_decode(file_get_contents('seriesdata.json'), true);

foreach ($Series as $movie) {

    $series_id = $movie['series_id'];
    $title = $movie['title'];
    $info = $movie['info'];

    $json = json_encode([
        'series_id' => $series_id,
        'title' => $title,
        'info' => $info
    ]);

    $params = [
        'TableName' => $tableName,
        'Item' => $marshaler->marshalJson($json)
    ];

    try {
        $result = $dynamodb->putItem($params);
        echo "Series added: " . $movie['series_id'] . " " . $movie['title'] . "\n";
    } catch (DynamoDbException $e) {
        echo "Couldn't upload data:\n";
        echo $e->getMessage() . "\n";
        break;
    }

}

?>
The Marshaler class includes methods for converting JSON documents and PHP arrays into the YDB format. In this code, $marshaler->marshalJson($json) accepts JSON data and converts it into a YDB record.
-
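To illustrate what such a conversion involves: in the DynamoDB-compatible item format, every value is wrapped in a one-key object that names its type (S for strings, N for numbers, M for maps, L for lists). The Python sketch below is a simplified illustration of that idea, not the SDK's Marshaler implementation; the marshal function is hypothetical and covers only a few basic types:

```python
def marshal(value):
    """Wrap a plain Python value in a DynamoDB-style typed attribute value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are carried as strings
    if isinstance(value, list):
        return {"L": [marshal(element) for element in value]}
    if isinstance(value, dict):
        return {"M": {key: marshal(element) for key, element in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


record = {
    "series_id": 1,
    "title": "IT Crowd",
    "info": {"release_date": "2006-02-03T00:00:00Z"},
}

# An item is a map of attribute name to typed value.
item = {name: marshal(value) for name, value in record.items()}
print(item["series_id"])  # {'N': '1'}
print(item["info"])  # {'M': {'release_date': {'S': '2006-02-03T00:00:00Z'}}}
```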
Create a file named seriesdata.json with the data to upload, for example, using the nano editor:
nano seriesdata.json
Copy the following code to the created file:
[
  {
    "series_id": 1,
    "title": "IT Crowd",
    "info": {
      "release_date": "2006-02-03T00:00:00Z",
      "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
    }
  },
  {
    "series_id": 2,
    "title": "Silicon Valley",
    "info": {
      "release_date": "2014-04-06T00:00:00Z",
      "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
    }
  },
  {
    "series_id": 3,
    "title": "House of Cards",
    "info": {
      "release_date": "2013-02-01T00:00:00Z",
      "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
    }
  },
  {
    "series_id": 3,
    "title": "The Office",
    "info": {
      "release_date": "2005-03-24T00:00:00Z",
      "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
    }
  },
  {
    "series_id": 3,
    "title": "True Detective",
    "info": {
      "release_date": "2014-01-12T00:00:00Z",
      "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
    }
  },
  {
    "series_id": 4,
    "title": "The Big Bang Theory",
    "info": {
      "release_date": "2007-09-24T00:00:00Z",
      "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
    }
  },
  {
    "series_id": 5,
    "title": "Twin Peaks",
    "info": {
      "release_date": "1990-04-08T00:00:00Z",
      "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
    }
  }
]
-
Run the program:
php SeriesLoadData.php
Result:
Series added: 1 IT Crowd
Series added: 2 Silicon Valley
Series added: 3 House of Cards
Series added: 3 The Office
Series added: 3 True Detective
Series added: 4 The Big Bang Theory
Series added: 5 Twin Peaks
-
Create the SeriesLoadData.js file, for example, using the nano editor:
nano SeriesLoadData.js
Copy the following code to the created file:
Warning
For <Document_API_endpoint>, provide the prepared value.
const AWS = require("@aws-sdk/client-dynamodb");
const fs = require("fs");

// Credentials should be defined via environment variables AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID
const dynamodb = new AWS.DynamoDBClient({
    region: "ru-central1",
    endpoint: "<Document_API_endpoint>",
});

console.log("Uploading series to YDB. Please wait...");

const allSeries = JSON.parse(fs.readFileSync('seriesdata.json', 'utf8'));

allSeries.forEach(function(series) {
    dynamodb.send(new AWS.PutItemCommand({
        TableName: "Series",
        Item: {
            "series_id": series.series_id,
            "title": series.title,
            "info": series.info
        }
    }))
        .then(() => {
            console.log("Series added:", series.title);
        })
        .catch(err => {
            console.error("Couldn't add series", series.title, ". Error JSON:", JSON.stringify(err, null, 2));
        })
});
-
Create a file named seriesdata.json with the data to upload, for example, using the nano editor:
nano seriesdata.json
Copy the following code to the created file:
[
  {
    "series_id": 1,
    "title": "IT Crowd",
    "info": {
      "release_date": "2006-02-03T00:00:00Z",
      "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
    }
  },
  {
    "series_id": 2,
    "title": "Silicon Valley",
    "info": {
      "release_date": "2014-04-06T00:00:00Z",
      "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
    }
  },
  {
    "series_id": 3,
    "title": "House of Cards",
    "info": {
      "release_date": "2013-02-01T00:00:00Z",
      "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
    }
  },
  {
    "series_id": 3,
    "title": "The Office",
    "info": {
      "release_date": "2005-03-24T00:00:00Z",
      "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
    }
  },
  {
    "series_id": 3,
    "title": "True Detective",
    "info": {
      "release_date": "2014-01-12T00:00:00Z",
      "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
    }
  },
  {
    "series_id": 4,
    "title": "The Big Bang Theory",
    "info": {
      "release_date": "2007-09-24T00:00:00Z",
      "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
    }
  },
  {
    "series_id": 5,
    "title": "Twin Peaks",
    "info": {
      "release_date": "1990-04-08T00:00:00Z",
      "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
    }
  }
]
-
Run the program:
node SeriesLoadData.js
Result:
Uploading series to YDB. Please wait...
Series added: The Office
Series added: IT Crowd
Series added: House of Cards
Series added: The Big Bang Theory
Series added: Twin Peaks
Series added: Silicon Valley
Series added: True Detective
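The titles in this output are not in file order because forEach starts every write request without waiting for the previous one, and each confirmation is printed as soon as its own request completes. The Python asyncio sketch below mimics this behavior under stated assumptions: fake_put is a hypothetical stand-in for a network write, and the delays are invented for illustration.

```python
import asyncio


async def fake_put(title, delay, log):
    """Hypothetical stand-in for a write request that takes `delay` seconds."""
    await asyncio.sleep(delay)
    log.append(title)


async def upload_concurrently(requests):
    """Start every request at once; completion order depends on timing."""
    log = []
    await asyncio.gather(*(fake_put(title, delay, log) for title, delay in requests))
    return log


async def upload_sequentially(requests):
    """Await each request before starting the next; file order is preserved."""
    log = []
    for title, delay in requests:
        await fake_put(title, delay, log)
    return log


# Invented delays: the first title is the slowest to complete.
requests = [("IT Crowd", 0.03), ("Silicon Valley", 0.02), ("The Office", 0.01)]

# Completion order here depends on the delays, not on the input order.
print(asyncio.run(upload_concurrently(requests)))
# Always ['IT Crowd', 'Silicon Valley', 'The Office'].
print(asyncio.run(upload_sequentially(requests)))
```

Sending requests concurrently is usually faster; waiting for each write in turn trades speed for a predictable order of confirmations.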
-
Create the SeriesLoadData.rb file, for example, using the nano editor:
nano SeriesLoadData.rb
Copy the following code to the created file:
Warning
For <Document_API_endpoint>, provide the prepared value.
require 'aws-sdk-dynamodb'
require 'json'

$series_counter = 0
$total_series = 0

def add_item_to_table(dynamodb_client, table_item)
  dynamodb_client.put_item(table_item)
  $series_counter += 1
  puts "Uploading series #{$series_counter}/#{$total_series}: " \
    "'#{table_item[:item]['title']} " \
    "(#{table_item[:item]['series_id']})'."
rescue StandardError => e
  puts "Error uploading series '#{table_item[:item]['title']} " \
    "(#{table_item[:item]['series_id']})': #{e.message}"
  puts 'Program stopped.'
  exit 1
end

def run_me
  region = 'ru-central1'
  table_name = 'Series'
  data_file = 'seriesdata.json'

  Aws.config.update(
    endpoint: '<Document_API_endpoint>',
    region: region
  )

  dynamodb_client = Aws::DynamoDB::Client.new
  file = File.read(data_file)
  series = JSON.parse(file)
  $total_series = series.count

  puts "#{$total_series} series from file '#{data_file}' will be uploaded " \
    "to table '#{table_name}'..."

  series.each do |seria|
    table_item = {
      table_name: table_name,
      item: seria
    }
    add_item_to_table(dynamodb_client, table_item)
  end

  puts 'Uploading completed successfully.'
end

run_me if $PROGRAM_NAME == __FILE__
-
Create a file named seriesdata.json with the data to upload, for example, using the nano editor:
nano seriesdata.json
Copy the following code to the created file:
[
  {
    "series_id": 1,
    "title": "IT Crowd",
    "info": {
      "release_date": "2006-02-03T00:00:00Z",
      "series_info": "The IT Crowd is a British sitcom produced by Channel 4, written by Graham Linehan, produced by Ash Atalla and starring Chris O'Dowd, Richard Ayoade, Katherine Parkinson, and Matt Berry"
    }
  },
  {
    "series_id": 2,
    "title": "Silicon Valley",
    "info": {
      "release_date": "2014-04-06T00:00:00Z",
      "series_info": "Silicon Valley is an American comedy television series created by Mike Judge, John Altschuler and Dave Krinsky. The series focuses on five young men who founded a startup company in Silicon Valley"
    }
  },
  {
    "series_id": 3,
    "title": "House of Cards",
    "info": {
      "release_date": "2013-02-01T00:00:00Z",
      "series_info": "House of Cards is an American political thriller streaming television series created by Beau Willimon. It is an adaptation of the 1990 BBC miniseries of the same name and based on the 1989 novel of the same name by Michael Dobbs"
    }
  },
  {
    "series_id": 3,
    "title": "The Office",
    "info": {
      "release_date": "2005-03-24T00:00:00Z",
      "series_info": "The Office is an American mockumentary sitcom television series that depicts the everyday work lives of office employees in the Scranton, Pennsylvania, branch of the fictional Dunder Mifflin Paper Company"
    }
  },
  {
    "series_id": 3,
    "title": "True Detective",
    "info": {
      "release_date": "2014-01-12T00:00:00Z",
      "series_info": "True Detective is an American anthology crime drama television series created and written by Nic Pizzolatto. The series, broadcast by the premium cable network HBO in the United States, premiered on January 12, 2014"
    }
  },
  {
    "series_id": 4,
    "title": "The Big Bang Theory",
    "info": {
      "release_date": "2007-09-24T00:00:00Z",
      "series_info": "The Big Bang Theory is an American television sitcom created by Chuck Lorre and Bill Prady, both of whom served as executive producers on the series, along with Steven Molaro"
    }
  },
  {
    "series_id": 5,
    "title": "Twin Peaks",
    "info": {
      "release_date": "1990-04-08T00:00:00Z",
      "series_info": "Twin Peaks is an American mystery horror drama television series created by Mark Frost and David Lynch that premiered on April 8, 1990, on ABC until its cancellation after its second season in 1991 before returning as a limited series in 2017 on Showtime"
    }
  }
]
-
Run the program:
ruby SeriesLoadData.rb
Result:
7 series from file 'seriesdata.json' will be uploaded to table 'Series'...
Uploading series 1/7: 'IT Crowd (1)'.
Uploading series 2/7: 'Silicon Valley (2)'.
Uploading series 3/7: 'House of Cards (3)'.
Uploading series 4/7: 'The Office (3)'.
Uploading series 5/7: 'True Detective (3)'.
Uploading series 6/7: 'The Big Bang Theory (4)'.
Uploading series 7/7: 'Twin Peaks (5)'.
Uploading completed successfully.