Monday, May 10, 2021

How do I properly build a pipeline architecture where all my data sources share the same structure?

I am trying to build a streaming data pipeline in which N data sources stream their data to a node manager, which saves it (permanently or temporarily) to PostgreSQL. It is similar to a chat, where a user sends a message and the other party receives and saves it. Before I start, I want to settle one question: should each data source stream its data as a list of dicts or as a single dict? The node manager would then group all the data and perform a bulk insert.

Example: https://avleonov.com/wp-content/uploads/2017/05/home-grown_vulnerability_database.png

My approach:

  • Celery for the task queue (2 tasks: one runs every 24 hours, the other every hour)
  • a node manager that acts like a chat to stream data
  • scrapers streaming data to the node manager
  • the node manager saves the data (I am not sure whether it should receive a list of dicts or a single JSON object, since single objects would mean many queries to the DB)
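One way to sidestep the "list of dicts or single dict" choice is to let scrapers send single dicts and have the node manager buffer them before writing. The following is only a minimal sketch of that idea: `BatchBuffer`, `flush_fn`, `max_size` and the placeholder CVE IDs are hypothetical names and values, and the real flush function would run one multi-row INSERT against PostgreSQL instead of appending to a list.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


class BatchBuffer:
    """Collects single records and hands them to `flush_fn` in batches.

    Hypothetical helper: `flush_fn` stands in for whatever performs the
    bulk insert (for example one multi-row INSERT per batch).
    """

    def __init__(self, flush_fn: Callable[[List[Record]], None], max_size: int = 500):
        self.flush_fn = flush_fn
        self.max_size = max_size
        self._items: List[Record] = []

    def add(self, record: Record) -> None:
        # Scrapers send one dict at a time; the buffer decides when to write.
        self._items.append(record)
        if len(self._items) >= self.max_size:
            self.flush()

    def flush(self) -> None:
        # Write whatever is pending as a single batch, then start a new one.
        if not self._items:
            return
        batch, self._items = self._items, []
        self.flush_fn(batch)


# Usage: record the batches in a list instead of touching a database.
batches: List[List[Record]] = []
buffer = BatchBuffer(flush_fn=batches.append, max_size=2)
for i in range(5):
    buffer.add({"cveid": f"CVE-0000-{i:04d}"})  # placeholder IDs, not real CVEs
buffer.flush()  # flush the remainder, e.g. at the end of a Celery task
batch_sizes = [len(b) for b in batches]
```

With this shape the number of queries depends on `max_size` rather than on how each scraper chooses to send its data.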

cvedetails (cvssv2 and cvssv3 were removed for speed)

{
   "cveid":"CVE-2009-20001",
   "summary":"An issue was discovered in MantisBT before 2.24.5. It associates a unique cookie string with each user. This string is not reset upon logout (i.e., the user session is still considered valid and active), allowing an attacker who somehow gained access to a user's cookie to login as them.",
   "vulnerability_type":"613",
   "affected_products":"Mantisbt Mantisbt *",
   "advisory_url":"https://mantisbt.org/bugs/view.php?id=27976",
   "detection_date":"2021-03-07"
}

NVD

{
   "cveid":"CVE-1999-0002",
   "summary":"Buffer overflow in NFS mountd gives root access to remote attackers, mostly in Linux systems.",
   "vulnerability_type":"CWE-119",
   "affected_products":"N/A",
   "cvssv2":{
      "version":"2.0",
      "vectorString":"AV:N/AC:L/Au:N/C:C/I:C/A:C",
      "accessVector":"NETWORK",
      "accessComplexity":"LOW",
      "authentication":"NONE",
      "confidentialityImpact":"COMPLETE",
      "integrityImpact":"COMPLETE",
      "availabilityImpact":"COMPLETE",
      "baseScore":Decimal("10.0")
   },
   "cvssv3":{
      
   },
   "advisory_url":[
      "http://www.securityfocus.com/bid/121",
      "ftp://patches.sgi.com/support/free/security/advisories/19981006-01-I",
      "http://www.ciac.org/ciac/bulletins/j-006.shtml"
   ],
   "detection_date":datetime.datetime(1998,10,12,4,0,tzinfo=tzutc())
}
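The two samples differ in shape: `advisory_url` is a string in one and a list in the other, `detection_date` is a string in one and a `datetime` in the other, and the CVSS blocks are missing from cvedetails. A possible way to give every source the same structure before it reaches the node manager is a small normalizing step. This is a sketch under assumptions: `normalize`, `to_iso_date` and `clean_cvss` are hypothetical helpers, the target field types are my own choice, and `datetime.timezone.utc` stands in for `tzutc()`.

```python
import datetime
import json
from decimal import Decimal
from typing import Any, Dict, Optional


def to_iso_date(value: Any) -> Optional[str]:
    """Return a YYYY-MM-DD string; strings are assumed to be ISO already."""
    if isinstance(value, datetime.datetime):
        return value.date().isoformat()
    if isinstance(value, datetime.date):
        return value.isoformat()
    return value


def clean_cvss(cvss: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Missing CVSS data becomes {}; Decimal values become strings."""
    return {k: str(v) if isinstance(v, Decimal) else v
            for k, v in (cvss or {}).items()}


def normalize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map a scraper record onto one shared, JSON-serializable shape."""
    urls = record.get("advisory_url") or []
    if isinstance(urls, str):
        urls = [urls]  # always a list, even for a single URL
    return {
        "cveid": record["cveid"],
        "summary": record.get("summary", ""),
        "vulnerability_type": record.get("vulnerability_type"),
        "affected_products": record.get("affected_products"),
        "cvssv2": clean_cvss(record.get("cvssv2")),
        "cvssv3": clean_cvss(record.get("cvssv3")),
        "advisory_url": urls,
        "detection_date": to_iso_date(record.get("detection_date")),
    }


# Abbreviated versions of the two samples above.
cvedetails_record = normalize({
    "cveid": "CVE-2009-20001",
    "vulnerability_type": "613",
    "advisory_url": "https://mantisbt.org/bugs/view.php?id=27976",
    "detection_date": "2021-03-07",
})
nvd_record = normalize({
    "cveid": "CVE-1999-0002",
    "vulnerability_type": "CWE-119",
    "cvssv2": {"version": "2.0", "baseScore": Decimal("10.0")},
    "cvssv3": {},
    "advisory_url": ["http://www.securityfocus.com/bid/121"],
    "detection_date": datetime.datetime(1998, 10, 12, 4, 0,
                                        tzinfo=datetime.timezone.utc),
})
payload = json.dumps([cvedetails_record, nvd_record])  # both now serialize
```

Once every record has the same keys and types, the column list for a bulk insert is fixed, regardless of which scraper produced the record.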

