Sunday, 8 September 2019

Scheduled process to copy files out of S3 into a temp folder on Ubuntu 18.04

Looking for recommendations for the following scenario:

On an Ubuntu 18.04 server, check an AWS S3 bucket for new files every minute and fetch only the newest files to a temp folder. At the end of the day, remove them.

It should be automated in bash.

I proposed using AWS S3 event notifications, queues, and Lambda, but it was decided that it is best to keep it simple.

I am looking for recommendations for the steps described below:

For step 1 I was doing aws s3 ls | awk (FUNCTION to filter files updated within the last minute), then I realized that it was best to do it with grep.
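For reference, here is a minimal sketch of the awk variant of that filter. It assumes the usual `aws s3 ls` line layout (date, time, size, key) and that the listing timestamps are in the same time zone as the cutoff. The function name `filter_recent` and the `BUCKET` variable in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `aws s3 ls` output on stdin and print the keys
# whose timestamp is at or after the cutoff given as "YYYY-MM-DD HH:MM:SS".
#
# Intended usage (BUCKET is a placeholder; date -d is GNU date):
#   cutoff="$(date -d '1 minute ago' '+%Y-%m-%d %H:%M:%S')"
#   aws s3 ls "s3://$BUCKET/" | filter_recent "$cutoff"
filter_recent() {
  local cutoff="$1"
  awk -v cutoff="$cutoff" '
    # Object lines start with a date; "PRE prefix/" lines do not and are skipped.
    # Timestamps in this format sort correctly as plain strings.
    $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ && ($1 " " $2) >= cutoff {
      # Drop the date, time and size columns; what remains is the key,
      # which may itself contain spaces.
      sub(/^[^ ]+ +[^ ]+ +[0-9]+ /, "")
      print
    }'
}
```

The comparison is done on the full "date time" string rather than with a grep pattern, so it also behaves across an hour or day boundary.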

0-Cron job should run every minute from 7:00 to 23:00
1-List the files updated in the S3 bucket during the past minute
2-List the files in a temp-encrypted folder on Ubuntu 18.04
3-Check whether the files listed in step 1 are already downloaded in the temp-encrypted folder from step 2
4-If the files are not already downloaded, download the newest files from the S3 bucket into temp-encrypted
5-At the end of the day (23:00), take a record of the last files fetched from S3
6-Run the cleanup script at the end of the day to remove everything in temp-encrypted
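The steps above could be sketched roughly as follows. This is only an outline under stated assumptions: the bucket name, folder paths, script paths and function names are all placeholders, and the crontab lines in the comments are one possible way to express the 7:00 to 23:00 schedule.

```shell
#!/usr/bin/env bash
# Step 0, hypothetical crontab entries (script paths are placeholders):
#   * 7-22 * * *  /usr/local/bin/s3-fetch.sh     # every minute, 07:00 to 22:59
#   0 23 * * *    /usr/local/bin/s3-cleanup.sh   # 23:00, record then clean up

BUCKET="${BUCKET:-example-bucket}"          # placeholder bucket name
DEST="${DEST:-/var/tmp/temp-encrypted}"     # placeholder local folder
LOG_DIR="${LOG_DIR:-/var/tmp/s3-fetch-log}" # placeholder folder for daily records

# Steps 2-3: read S3 keys on stdin, print those not yet present in directory $1.
missing_files() {
  local dir="$1" key
  while IFS= read -r key; do
    if [ -n "$key" ] && [ ! -e "$dir/$key" ]; then
      printf '%s\n' "$key"
    fi
  done
}

# Step 4: read keys on stdin and download each one into $DEST.
fetch_missing() {
  local key
  while IFS= read -r key; do
    aws s3 cp "s3://$BUCKET/$key" "$DEST/$key" --only-show-errors
  done
}

# Steps 5-6: record what was fetched today, then empty the folder.
end_of_day() {
  ls -1 "${DEST:?}" > "$LOG_DIR/fetched-$(date +%F).txt"
  find "${DEST:?}" -mindepth 1 -delete
}

# Possible wiring for the every-minute job (assumes keys without spaces):
#   aws s3 ls "s3://$BUCKET/" | awk '{print $4}' | missing_files "$DEST" | fetch_missing
```

Because `missing_files` skips anything already on disk, a run that overlaps with the previous minute does not download the same file twice.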

I attach a diagram with the intended process and infrastructure design.
