Finding and fixing errors when creating a demo stand
If you encounter problems while creating a SpeechKit Hybrid demo stand, run diagnostics to detect errors:
- Connect to the VM over SSH:

  ```bash
  ssh <username>@<VM_public_IP_address>
  ```

  Where `<username>` is the VM account username. You can find the VM's public IP address on the VM page in the management console.
- Check whether ports `8080` and `9080` are open to receive client requests:

  ```bash
  telnet <VM_public_address> 8080 && telnet <VM_public_address> 9080
  ```

  In the command, specify the public IP address of the VM that was created with the infrastructure. You can learn how to get the IP address of a VM here.
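Telnet holds an interactive session open, which is awkward in scripts. As a non-authoritative alternative, the same probe can be sketched with bash's built-in `/dev/tcp` device (the IP address below is a placeholder; substitute your VM's public address):

```bash
# check_port: probe a TCP port and report whether it accepts connections.
# Uses bash's /dev/tcp pseudo-device; `timeout` bounds the wait for
# unroutable hosts.
check_port() {
  local host=$1 port=$2
  if timeout 2 bash -c ">/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${host}:${port} open"
  else
    echo "${host}:${port} closed"
  fi
}

VM_ADDR="198.51.100.10"   # placeholder: your VM's public IP address
check_port "$VM_ADDR" 8080
check_port "$VM_ADDR" 9080
```

Unlike `telnet`, this returns immediately, so it can be dropped into a readiness loop or a CI check.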
- Check the list of downloaded Docker images:

  ```bash
  docker images --digests
  ```

  Make sure the required images are there. They are loaded into Yandex Container Registry after you provide the registry ID to the SpeechKit team.

  Expected result:

  ```text
  REPOSITORY                                        TAG   DIGEST             IMAGE ID   CREATED   SIZE
  cr.yandex/crp33...7i/release/stt/v100/stt_server  0.21  sha256:83245...6b  0d1...89   ...       15.3GB
  cr.yandex/crp33...7i/release/tts/v100/tts_server  0.21  sha256:41c1f...ea  d3a...7d   ...       16.1GB
  cr.yandex/crp33...7i/release/envoy                0.21  sha256:853ed...cb  6f7...31   ...       220MB
  cr.yandex/crp33...7i/release/license_server       0.21  sha256:44d24...3d  59e...62   ...       1.23GB
  ```

  If image tags were changed, make sure you used the right Docker image during load testing: compare the hash sum of the image you used against the `DIGEST` column in the resulting list.
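The digest comparison can also be scripted rather than eyeballed. A minimal sketch, assuming the local image list has been saved to a file first (the digest value below is a placeholder for the one you actually tested with):

```bash
# digest_of: print the DIGEST column for images whose repository matches
# the given substring, from a saved `docker images --digests` listing.
# Usage: digest_of <listing-file> <repository-substring>
digest_of() {
  awk -v repo="$2" '$1 ~ repo {print $3}' "$1"
}

docker images --digests > images.txt 2>/dev/null
used_digest="sha256:83245...6b"   # placeholder: the digest you load-tested with
listed_digest=$(digest_of images.txt stt_server)
[ "$used_digest" = "$listed_digest" ] && echo "digests match" || echo "digest mismatch"
```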
- Make sure Docker containers are successfully launched from the SpeechKit Hybrid images:

  ```bash
  docker ps -a
  ```

  Example of expected result:

  ```text
  CONTAINER ID  IMAGE                                                   ...  STATUS            ...
  659...a0      cr.yandex/crp33...7i/release/stt/v100/stt_server:0.21   ...  Up About an hour  ...
  af3...1f      cr.yandex/crp33...7i/release/tts/v100/tts_server:0.21   ...  Up About an hour  ...
  e42...36      cr.yandex/crp33...7i/release/envoy:0.21                 ...  Up About an hour  ...
  a4a...43      cr.yandex/crp33...7i/release/license_server:0.21        ...  Up About an hour  ...
  ```
- Check the list of open network connections and the network configuration:

  - Install the `netstat` utility:

    ```bash
    sudo apt install net-tools
    ```

  - Make sure SpeechKit Hybrid services are ready to serve network connections on their dedicated ports. For the list of ports, see the `docker-compose.yaml` file. It is stored in the `node-deploy.tf` file, in the `COMPOSE_V100_STT_TTS` variable.

    Run this command to get information about the network connections of the services:

    ```bash
    sudo netstat -tulpn && sudo ip addr
    ```

    Expected result:

    ```text
    Proto Recv-Q Send-Q Local Address    Foreign Address   State    PID/Program name
    tcp        0      0 0.0.0.0:8080     0.0.0.0:*         LISTEN   1582/envoy
    tcp        0      0 0.0.0.0:9080     0.0.0.0:*         LISTEN   1582/envoy
    tcp        0      0 0.0.0.0:17882    0.0.0.0:*         LISTEN   1688/asr_server
    tcp        0      0 0.0.0.0:17982    0.0.0.0:*         LISTEN   1637/tts_server
    tcp        0      0 0.0.0.0:9091     0.0.0.0:*         LISTEN   1582/envoy
    tcp6       0      0 :::8085          :::*              LISTEN   1581/java
    tcp6       0      0 :::8086          :::*              LISTEN   1581/java
    tcp6       0      0 :::8087          :::*              LISTEN   1581/java
    tcp6       0      0 :::17880         :::*              LISTEN   1688/asr_server
    tcp6       0      0 :::17980         :::*              LISTEN   1637/tts_server
    tcp6       0      0 :::9087          :::*              LISTEN   1581/java
    tcp6       0      0 :::8003          :::*              LISTEN   1581/java
    ```
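Checking each port by hand is error-prone, so the listening check can be scripted. A rough sketch, assuming the `netstat` output has been saved to a file; the port list here is an assumption taken from the expected output and should come from your `docker-compose.yaml`:

```bash
# check_ports: report which of the given ports appear as listeners in a
# saved `netstat -tulpn` dump.
# Usage: check_ports <dump-file> <port> [<port> ...]
check_ports() {
  local file=$1; shift
  local port
  for port in "$@"; do
    if grep -qE "[:.]${port}[[:space:]]" "$file"; then
      echo "port ${port}: LISTEN"
    else
      echo "port ${port}: MISSING"
    fi
  done
}

sudo netstat -tulpn > netstat.txt 2>/dev/null
check_ports netstat.txt 8080 9080 17882 17982 9091
```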
- Make sure the GPU driver is healthy. Its integration with the containerization system in use is one of the hardware requirements.

  - Find out the NVIDIA kernel module version:

    ```bash
    cat /proc/driver/nvidia/version
    ```

    Example of expected result:

    ```text
    NVRM version: NVIDIA UNIX x86_64 Kernel Module  470.199.02 ...
    ```

  - Check the GPU driver status:

    ```bash
    sudo nvidia-smi
    ```

    Example of expected result:

    ```text
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 470.199.02    Driver Version: 470.199.02    CUDA Version: 11.4   |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla V100-SXM2...  Off  | 00000000:8B:00.0 Off |                    0 |
    | N/A   33C    P0    36W / 300W |      0MiB / 32510MiB |      2%      Default |
    +-------------------------------+----------------------+----------------------+
    ```

  - Make sure the GPU driver has been successfully integrated into the containerization system:

    ```bash
    nvidia-container-cli info
    ```

    Example of expected result:

    ```text
    NVRM version:   470.199.02
    CUDA version:   11.4
    Device Index:   0
    Device Minor:   0
    Model:          Tesla V100-SXM2-32GB
    Brand:          Tesla
    GPU UUID:       GPU-1af...cb
    Bus Location:   00000000:8b:00.0
    Architecture:   7.0
    ```
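The kernel module and `nvidia-container-cli` should report the same driver version; a mismatch is a sign of a broken integration. A minimal sketch of that consistency check (the version-extraction pattern is an assumption based on the output shown above):

```bash
# nvrm_version: extract the first dotted version number from an
# "NVRM version: ..." line read on stdin.
nvrm_version() {
  grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n1
}

kernel_ver=$(cat /proc/driver/nvidia/version 2>/dev/null | nvrm_version)
cli_ver=$(nvidia-container-cli info 2>/dev/null | grep 'NVRM version' | nvrm_version)
if [ "$kernel_ver" = "$cli_ver" ]; then
  echo "driver versions match: $kernel_ver"
else
  echo "driver version mismatch: kernel=$kernel_ver cli=$cli_ver"
fi
```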
- Check for `WARNING`, `ERROR`, `EMERG`, or `ALERT` errors in the STDOUT output of the containers. To do this, dump the output into text files. Run the following command in the `yc-speechkit-hybrid-deployment` repository directory:

  ```bash
  mkdir -p logs ; cd ./logs
  for c in $(docker ps --format '{{.Names}}' | awk '{print $NF}'); do
    echo $c && docker logs $c &> $c.log
  done
  ```

  If contacting support, report the command you executed and send the text files you got.
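Once the logs are dumped, a quick scan for those error levels narrows down which lines are worth sending to support. A sketch, assuming it is run from the repository directory after the dump above:

```bash
# scan_logs: print file-relative line numbers of log lines matching the
# error levels listed above. Missing files are ignored silently.
scan_logs() {
  grep -nE 'WARNING|ERROR|EMERG|ALERT' "$@" 2>/dev/null || true
}

scan_logs ./logs/*.log
```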
- Study the contents of the `docker-compose.yaml` file used to launch the Docker containers. `docker-compose.yaml` is described in the `node-deploy.tf` file, in the `COMPOSE_V100_STT_TTS` variable. The contents of the variable are automatically dumped into the `docker-compose.yaml` file, which is hosted and built on the VM that runs SpeechKit Hybrid.

  You may get errors during the build. To troubleshoot them, make sure the contents of the `docker-compose.yaml` file are consistent with the environment configuration information collected in the previous steps.
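As a rough first consistency check, you can verify that the rendered file mentions every expected component. The service name substrings below are an assumption based on the image list shown earlier; adjust them to your deployment:

```bash
# check_services: report whether each given name appears anywhere in the
# compose file. A crude textual check, not a full YAML validation.
# Usage: check_services <compose-file> <name> [<name> ...]
check_services() {
  local file=$1; shift
  local svc
  for svc in "$@"; do
    if grep -q "$svc" "$file" 2>/dev/null; then
      echo "$svc: declared"
    else
      echo "$svc: missing"
    fi
  done
}

check_services docker-compose.yaml stt_server tts_server envoy license_server
```

For a structural validation, `docker compose -f docker-compose.yaml config` on the VM will parse the file and report syntax errors.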