Due to licensing constraints, StreamSets is not permitted to ship some of the most popular JDBC drivers. This can lead to problems when deploying and setting up the vendor-supplied JDBC drivers. In most cases, incorrectly setting up the drivers will cause JDBC_00 and JDBC_06 errors.
The first type of error occurs when the driver's jar file is not correctly installed. In this case, the trace in sdc.log reports "Unable to get driver instance", because the driver jar file is not available to Data Collector's JVM. Verify the installation of the JDBC driver jar file by following the instructions to install external libraries here.
This sample error and stack trace is from an incorrect setup using MySQL.
2018-03-06 15:09:34,871 [user:*admin] [pipeline:jdbc origin -> trash/jdbcorigintrashba2851a6-4013-45cd-aa0d-57f703cde9ce] [runner:] [thread:preview-pool-1-thread-1] ERROR JdbcSource - Cannot connect to specified database:
com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: java.lang.RuntimeException: Unable to get driver instance for jdbcUrl=jdbc:mysql://localhost:3306/default
com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: java.lang.RuntimeException: Unable to get driver instance for jdbcUrl=jdbc:mysql://localhost:3306/default
at com.streamsets.pipeline.lib.jdbc.JdbcUtil.createDataSourceForRead(JdbcUtil.java:783)
at com.streamsets.pipeline.stage.origin.jdbc.JdbcSource.init(JdbcSource.java:220)
at com.streamsets.pipeline.api.base.BaseStage.init(BaseStage.java:48)
at com.streamsets.pipeline.configurablestage.DStage.init(DStage.java:36)
In the second case, the driver jar file is installed in the correct directory and the STREAMSETS_LIBRARIES_EXTRA_DIR environment variable points to it correctly, but sdc-security.policy has not been updated. The driver's jar file is then accessible to the Data Collector JVM, but the JVM can only open it; it is not permitted to execute the code inside. You can verify that Data Collector's JVM has opened the driver jar file with the `lsof -p (pid) | grep (jar file name)` command.
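On Linux, a check similar to the `lsof` command above can be sketched by reading the symlinks under `/proc/<pid>/fd`. The following Python snippet is only an illustration and is not part of Data Collector; the helper name `open_files_matching` is hypothetical. It sees open file descriptors only, so `lsof` remains the more complete check, and it assumes the script runs as a user allowed to read the process's descriptors.

```python
import os
from pathlib import Path


def open_files_matching(fd_dir, needle):
    """Return targets of the symlinks in fd_dir whose path contains needle.

    On Linux, /proc/<pid>/fd holds one symlink per open file descriptor,
    so this approximates `lsof -p <pid> | grep <needle>` for regular files.
    """
    matches = []
    for entry in Path(fd_dir).iterdir():
        try:
            target = os.readlink(entry)
        except OSError:
            # The descriptor was closed, or is unreadable, between listing and reading.
            continue
        if needle in target:
            matches.append(target)
    return sorted(matches)


# Example call (pid and jar name are placeholders to fill in):
#   open_files_matching("/proc/<pid>/fd", "<jar file name>")
```

An empty result for a jar that is installed suggests the JVM never opened it, which points back to the installation or environment variable rather than the security policy.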
The "Could not initialize class" message in the following trace is typical of a MySQL installation with this problem.
2018-03-06 15:06:32,848 [user:*admin] [pipeline:jdbc origin -> trash/jdbcorigintrashba2851a6-4013-45cd-aa0d-57f703cde9ce] [runner:] [thread:ProductionPipelineRunnable-jdbcorigintrashba2851a6-4013-45cd-aa0d-57f703cde9ce-jdbc origin -> trash] ERROR ProductionPipelineRunnable - An exception occurred while running the pipeline,
com.streamsets.datacollector.runner.PipelineRuntimeException: CONTAINER_0702 - Pipeline initialization error: java.lang.NoClassDefFoundError: Could not initialize class com.mysql.jdbc.StringUtils
com.streamsets.datacollector.runner.PipelineRuntimeException: CONTAINER_0702 - Pipeline initialization error: java.lang.NoClassDefFoundError: Could not initialize class com.mysql.jdbc.StringUtils
at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:104)
at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:74)
at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:754)
at com.streamsets.datacollector.execution.AbstractRunner.lambda$scheduleForRetries$0(AbstractRunner.java:173)
at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:227)
In the next type of error, the driver jar is correctly installed, the STREAMSETS_LIBRARIES_EXTRA_DIR environment variable is correctly set, and the security policy is updated, but the driver class still needs to be specified in the Legacy Drivers tab. There are two reasons why this may be required. First, drivers that predate JDBC version 4 are not picked up dynamically by the JVM, so setting the driver class name is required.
Second, with a JDBC version 4 driver there appears to be an intermittent underlying problem, in either the JVM or Data Collector, in which the drivers are not fully registered by the JVM for use by Data Collector. This issue is currently tracked in SDC-4911.
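For the first reason, it can help to know whether a given driver jar supports JDBC 4 auto-loading at all. JDBC 4 drivers declare their driver class in a `META-INF/services/java.sql.Driver` entry inside the jar, and older drivers have no such entry. The Python sketch below reads that entry; the helper name `declared_jdbc_drivers` is hypothetical. It inspects the jar only and says nothing about whether Data Collector actually registered the driver.

```python
import zipfile

SERVICE_ENTRY = "META-INF/services/java.sql.Driver"


def declared_jdbc_drivers(jar_path):
    """Return the driver class names a jar declares for JDBC 4 auto-loading.

    An empty list means the jar has no service entry, so the driver class
    name has to be supplied explicitly.
    """
    with zipfile.ZipFile(jar_path) as jar:
        try:
            raw = jar.read(SERVICE_ENTRY)
        except KeyError:
            # Entry is absent: typical of drivers that predate JDBC 4.
            return []
    names = []
    for line in raw.decode("utf-8").splitlines():
        # The service file format allows '#' comments and blank lines.
        line = line.split("#", 1)[0].strip()
        if line:
            names.append(line)
    return names
```

If the list is empty, the driver class name most likely has to be entered on the Legacy Drivers tab. If it is not empty and the driver still fails to register, the intermittent registration problem is the more likely explanation.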
The easiest way to determine if this is the problem is to:
1. Run `ps -ef | grep BootstrapMain` (or use `jps`) to get the pid for Data Collector.
2. Run `lsof -p (pid) | grep (driver jar file)` to see whether the driver's jar file has been opened by the Data Collector JVM.
3. Check the trace in sdc.log after running the pipeline and seeing this failure. A trace that indicates correctly installed JDBC drivers looks something like the following, a listing of the drivers that are correctly installed and available:
2018-03-02 08:02:26,693 [user:*admin] [pipeline:JDBC -> trash/bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b] [runner:] [thread:ProductionPipelineRunnable-bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b-JDBC -> trash] INFO JdbcUtil - Registered JDBC drivers:
2018-03-02 08:02:26,694 [user:*admin] [pipeline:JDBC -> trash/bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b] [runner:] [thread:ProductionPipelineRunnable-bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b-JDBC -> trash] INFO JdbcUtil - Driver class org.postgresql.Driver (version 42.0)
2018-03-02 08:02:26,694 [user:*admin] [pipeline:JDBC -> trash/bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b] [runner:] [thread:ProductionPipelineRunnable-bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b-JDBC -> trash] INFO JdbcUtil - Driver class com.mysql.fabric.jdbc.FabricMySQLDriver (version 5.1)
2018-03-02 08:02:26,699 [user:*admin] [pipeline:JDBC -> trash/bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b] [runner:] [thread:ProductionPipelineRunnable-bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b-JDBC -> trash] INFO JdbcUtil - Driver class oracle.jdbc.OracleDriver (version 11.2)
2018-03-02 08:02:26,699 [user:*admin] [pipeline:JDBC -> trash/bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b] [runner:] [thread:ProductionPipelineRunnable-bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b-JDBC -> trash] INFO JdbcUtil - Driver class com.mysql.jdbc.Driver (version 5.1)
2018-03-02 08:02:26,704 [user:*admin] [pipeline:JDBC -> trash/bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b] [runner:] [thread:ProductionPipelineRunnable-bobJDBC86f32830-31aa-47ba-b624-424e1d9ad55b-JDBC -> trash] INFO HikariDataSource - HikariPool-0 - is starting
If the driver jar file is in lsof's list of open files, but no drivers are listed in sdc.log after the "Registered JDBC drivers" trace, then this may be the problem. If the driver jar files are NOT found in the lsof list, recheck the previous steps: correct installation of the jar file, correct location specified by the STREAMSETS_LIBRARIES_EXTRA_DIR environment variable, and a correctly updated sdc-security.policy. Also, remember that Data Collector must be restarted after any of the changes listed above; they are not picked up dynamically while Data Collector runs.
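Checking the log for registered drivers can be partly automated. The following is a minimal Python sketch, assuming sdc.log lines have the same shape as the sample trace above; `registered_drivers` is a hypothetical helper name, not a Data Collector API, and interleaved output from several pipelines could confuse it.

```python
import re

# Matches lines such as:
#   ... INFO JdbcUtil - Driver class org.postgresql.Driver (version 42.0)
DRIVER_LINE = re.compile(r"Driver class (\S+) \(version ([^)]+)\)")


def registered_drivers(log_lines):
    """Collect (class name, version) pairs listed after 'Registered JDBC drivers:'."""
    drivers = []
    in_listing = False
    for line in log_lines:
        if "Registered JDBC drivers:" in line:
            # A new listing starts; keep only the most recent one.
            drivers = []
            in_listing = True
            continue
        if in_listing:
            match = DRIVER_LINE.search(line)
            if match:
                drivers.append((match.group(1), match.group(2)))
            else:
                # The first non-driver line ends the listing.
                in_listing = False
    return drivers
```

An empty result, combined with the jar appearing in the lsof output, is consistent with the registration problem described above.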
Here is a sample trace from trying to use MySQL when the drivers are correctly installed but not registered. The CONTAINER_0800 error may be a clue in this trace:
2018-02-12 08:51:32,628 [user:*admin] [pipeline:myJdbc:466477c7-cbfd-4d50-8ec4-856e9a1c113a] [runner:] [thread:ProductionPipelineRunnable-myJdbc:466477c7-cbfd-4d50-8ec4-856e9a1c113a] ERROR ProductionPipelineRunnable - An exception occurred while running the pipeline, com.streamsets.datacollector.runner.PipelineRuntimeException: CONTAINER_0800 - Pipeline 'myJdbc:466477c7-cbfd-4d50-8ec4-856e9a1c113a' validation error : JDBC_00 - Cannot connect to specified database: com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: java.lang.RuntimeException: Unable to get driver instance for jdbcUrl=jdbc:mysql://localhost:3306/default
com.streamsets.datacollector.runner.PipelineRuntimeException: CONTAINER_0800 - Pipeline 'myJdbc:466477c7-cbfd-4d50-8ec4-856e9a1c113a' validation error : JDBC_00 - Cannot connect to specified database: com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: java.lang.RuntimeException: Unable to get driver instance for jdbcUrl=jdbc:mysql://localhost:3306/default
at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:131)
at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:74)
at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:754)
at com.streamsets.datacollector.execution.runner.common.AsyncRunner.lambda$start$3(AsyncRunner.java:152)
at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.lambda$call$0(SafeScheduledExecutorService.java:227)
at com.streamsets.datacollector.security.GroupsInScope.execute(GroupsInScope.java:33)
at com.streamsets.pipeline.lib.executor.SafeScheduledExecutorService$SafeCallable.call(SafeScheduledExecutorService.java:223)
In the last set of problems, everything is installed correctly for the JDBC connection, and we see the correct trace for registered JDBC drivers in sdc.log. Errors such as these are more likely to be related to connectivity or to the database not accepting new connections.
When looking at Connection Timeout errors, the trace includes information from the underlying JDBC driver. The first example shows a failure to connect to a Teradata instance:
Pipeline Status: START_ERROR: QUERY_EXECUTOR_002 - Can't open connection: JDBC_06 - Failed to initialize connection pool: com.zaxxer.hikari.pool.PoolInitializationException: Exception during pool initialization: [Teradata JDBC Driver] [TeraJDBC 15.10.00.26] [Error 1277] [SQLState 08S01] Login timeout for Connection to devbox.blah.com
Mon Mar 05 01:30:45 EST 2018 socket orig=devbox.blah.com cid=67b74247 sess=0 java.net.SocketTimeoutException: connect timed out at java.net.PlainSocketImpl.socketConnect(Native Method) at
When this error occurs, the driver is installed correctly, the environment is set up correctly, and the sdc-security.policy file has been correctly updated.
At this point, the problem will likely be in connectivity, and you should check the JDBC connection string for the correct hostname, port and schema. In addition, you should check whether any command line tools can connect to the destination database from the same machine on which Data Collector is installed.
In the following example, Data Collector is trying to connect to a MySQL instance that is behind a firewall.
com.streamsets.datacollector.runner.PipelineRuntimeException: CONTAINER_0800 - Pipeline 'jdbcorigintrashba2851a6-4013-45cd-aa0d-57f703cde9ce' validation error : JDBC_00 - Cannot connect to specified database:
com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: com.zaxxer.hikari.pool.PoolInitializationException: Exception during pool initialization: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:131)
at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:74)
at com.streamsets.datacollector.execution.runner.standalone.StandaloneRunner.start(StandaloneRunner.java:754)
at com.streamsets.datacollector.execution.AbstractRunner.lambda$scheduleForRetries$0(AbstractRunner.java:173)
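Both traces above are connectivity failures, so a quick TCP reachability test from the Data Collector machine can narrow things down before looking at the driver again. The following Python sketch is one way to do that; `can_connect` is a hypothetical helper name, and the host and port should be taken from your own JDBC connection string. A successful TCP connection only shows that the port is reachable, not that the database will accept a login.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call, using the host and port from the MySQL URL shown earlier:
#   can_connect("localhost", 3306)
```

Run it from the same machine and, ideally, as the same user as Data Collector, since firewall rules often differ between hosts.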
The last type of error is fairly obvious when using MySQL: it occurs when the user's credentials are not correct for the database. In this case we are providing an invalid user name and password to a MySQL database:
2018-03-08 08:59:12,226 [user:*admin] [pipeline:jdbc origin -> trash/jdbcorigintrashba2851a6-4013-45cd-aa0d-57f703cde9ce] [runner:] [thread:ProductionPipelineRunnable-jdbcorigintrashba2851a6-4013-45cd-aa0d-57f703cde9ce-jdbc origin -> trash] ERROR ProductionPipelineRunnable - An exception occurred while running the pipeline, com.streamsets.datacollector.runner.PipelineRuntimeException: CONTAINER_0800 - Pipeline 'jdbcorigintrashba2851a6-4013-45cd-aa0d-57f703cde9ce' validation error : JDBC_00 - Cannot connect to specified database: com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: com.zaxxer.hikari.pool.PoolInitializationException: Exception during pool initialization: Access denied for user 'x'@'172.17.0.1' (using password: YES)
com.streamsets.datacollector.runner.PipelineRuntimeException: CONTAINER_0800 - Pipeline 'jdbcorigintrashba2851a6-4013-45cd-aa0d-57f703cde9ce' validation error : JDBC_00 - Cannot connect to specified database: com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: com.zaxxer.hikari.pool.PoolInitializationException: Exception during pool initialization: Access denied for user 'x'@'172.17.0.1' (using password: YES)
at com.streamsets.datacollector.execution.runner.common.ProductionPipeline.run(ProductionPipeline.java:131)
at com.streamsets.datacollector.execution.runner.common.ProductionPipelineRunnable.run(ProductionPipelineRunnable.java:74)
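As a summary, the message fragments seen in the traces throughout this article can be mapped to the causes discussed for each. The following Python sketch does only that; the names `KNOWN_SYMPTOMS` and `likely_cause` are hypothetical, the list covers just the examples shown here, and other drivers will word their errors differently.

```python
# Message fragments taken from the traces in this article, paired with the
# cause the article associates with each. The first matching fragment wins.
KNOWN_SYMPTOMS = [
    ("Could not initialize class", "sdc-security.policy not updated"),
    ("Access denied for user", "invalid credentials"),
    ("Login timeout", "connectivity"),
    ("Communications link failure", "connectivity"),
    ("Unable to get driver instance", "driver jar not installed or driver not registered"),
]


def likely_cause(trace_text):
    """Return the likely cause for a trace, or None if no known fragment matches."""
    for fragment, cause in KNOWN_SYMPTOMS:
        if fragment in trace_text:
            return cause
    return None
```

A result of `None` means the trace does not match any example in this article and needs to be read in full.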
October 07, 2020 17:10