From Newsgroup: comp.lang.python
Dear Python Experts,
I am working with the Kenna Application's API to retrieve vulnerability
data. The API endpoint provides a single, massive JSON file in gzip format, approximately 60 GB in size. Handling such a large dataset in one go is
proving to be quite challenging, especially in terms of memory management.
I am looking for guidance on how to efficiently stream this data and
process it in chunks using Python. Specifically, I am wondering whether
the requests library, or any other library, would allow me to pull data
from the API endpoint in a memory-efficient manner.
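To make the question concrete, here is a rough sketch of what I have in
mind for the download side, based on my reading of the requests
documentation. The URL, the search_id parameter, and the X-Risk-Token
header name are placeholders taken from my reading of the Kenna docs,
so please correct me if any of them are wrong:

import requests

API_TOKEN = "YOUR_KENNA_TOKEN"  # placeholder
# Guessed from the retrieve-data-export docs; may not be exact.
EXPORT_URL = "https://api.kennasecurity.com/data_exports"

# stream=True keeps requests from buffering the whole 60 GB body;
# iter_content() then yields the compressed bytes in fixed-size chunks.
with requests.get(
    EXPORT_URL,
    headers={"X-Risk-Token": API_TOKEN},  # auth header per the docs?
    params={"search_id": 12345},          # placeholder export id
    stream=True,
    timeout=(10, 300),
) as resp:
    resp.raise_for_status()
    with open("kenna_export.json.gz", "wb") as out:
        for chunk in resp.iter_content(chunk_size=1 << 20):  # 1 MiB
            out.write(chunk)

My understanding is that stream=True avoids loading the response into
memory, but I am not sure whether writing the file to disk first is
necessary or whether decompression can happen on the fly.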
Here are the relevant API endpoints from Kenna:
- Kenna API Documentation
  <https://apidocs.kennasecurity.com/reference/welcome>
- Kenna Vulnerabilities Export
  <https://apidocs.kennasecurity.com/reference/retrieve-data-export>
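For the parsing side, I came across the third-party ijson library,
which parses JSON incrementally. Combined with gzip, something like the
following seems plausible, though the top-level "vulnerabilities" key
is only my guess at the export's structure:

import gzip
import ijson  # third-party: pip install ijson

def process(record):
    # Placeholder for whatever per-vulnerability handling is needed.
    print(record.get("id"))

with gzip.open("kenna_export.json.gz", "rb") as gz:
    # ijson walks the document event by event, so only one record
    # should be held in memory at a time.
    for record in ijson.items(gz, "vulnerabilities.item"):
        process(record)

Is this a reasonable approach, or is there a more idiomatic way to
combine the download and parsing steps?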
If anyone has experience with similar use cases or can offer any advice, it would be greatly appreciated.
Thank you in advance for your help!
Best regards,
Asif Ali