
I am reading a file from an S3 bucket and doing a lookup against data from a Postgres DB using the JDBC Lookup processor in Data Collector.

The source has about 341k records, against 341 records in the Postgres DB.

My observations are:

1. It takes 30 minutes to process 50K records.

2. Some records go to error even though a matching record is present in the DB.

3. I have tried enabling the local cache.
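Since the lookup side is only 341 rows, one option worth trying (outside Data Collector, just to illustrate the difference) is loading the whole table into memory once and joining locally, instead of issuing one query per source record. A minimal sketch below uses Python with sqlite3 as a stand-in for Postgres; the table name, key/value columns, and sample keys are made up for illustration:

```python
import sqlite3

# Stand-in for the Postgres lookup table (341 rows, as in the real case).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lookup (key TEXT PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO lookup VALUES (?, ?)",
                 [(f"k{i}", f"v{i}") for i in range(341)])

# Per-record lookup: roughly what an uncached JDBC Lookup does --
# one round trip per source record, ~341k queries in total.
def lookup_per_record(key):
    row = conn.execute("SELECT value FROM lookup WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None

# Preload once, then join in memory -- a single query in total.
cache = dict(conn.execute("SELECT key, value FROM lookup"))

source_records = [f"k{i % 500}" for i in range(1000)]  # some keys won't match
matched = [(k, cache[k]) for k in source_records if k in cache]
unmatched = [k for k in source_records if k not in cache]
```

For observation 2 (matching records still going to error), it is also worth checking the lookup key for whitespace, case, or type mismatches between the S3 data and the DB column, since those make an apparently matching value miss.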


Hi,

Did you find any solution to this issue?



Hi @ashok verma,

Have you tried configuring and/or adjusting the Eviction Policy Type, Maximum Entries to Cache, and Minimum Idle Connections properties? According to the docs: "Note: When local caching is enabled, configure this property carefully to avoid monopolizing Data Collector resources. For more information, see Using Additional Threads."
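To make the interplay of those two properties concrete: a minimal sketch of what Maximum Entries to Cache combined with an LRU-style eviction policy does, in plain Python (the class name and `load` callback are illustrative, not Data Collector's actual implementation):

```python
from collections import OrderedDict

class LRULookupCache:
    """Illustrative LRU cache: once max_entries is reached, the least
    recently used key is evicted to make room for the next one."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, key, load):
        # load(key) stands in for the actual JDBC query on a cache miss.
        if key in self._entries:
            self._entries.move_to_end(key)   # mark as recently used
            return self._entries[key]
        value = load(key)
        self._entries[key] = value
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return value
```

The practical point for this thread: with only 341 distinct keys on the DB side, a Maximum Entries to Cache of at least 341 means each key should hit the database only once; a smaller value forces repeated evictions and re-queries.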



I haven’t found any solution yet. I tried those combinations as well, but it didn’t work.


Hi @ashok verma. If you could share your pipeline (feel free to strip sensitive information), we can have a detailed look at this. Also, the logs from running it at debug log level would be very helpful.


@ashok verma, as @Dimas Cabré i Chacón mentioned, please share your pipeline. Please do not share personal or sensitive information.


Hi - did you find any solution to optimize the lookup against the PostgreSQL DB?

