Cannot retrieve data from an S3 bucket in a deployed Lambda

From a Lambda function, I invoke a Step Function that processes data and stores it in an S3 bucket. From within the same Lambda function I then try to download the data, but I get an "AccessDenied" error (see further below).

If I run the Lambda function a second time, I get no error and the execution terminates successfully. My understanding is that during the first run the data is not yet stored when I try to download it, which would explain why the second attempt works.

I am using async/await, thinking that this would be enough to pause execution until the data is stored. Is there something I am doing incorrectly?

Here is an extract of the code (step function not detailed here):

const AWS = require('aws-sdk');
AWS.config.update({region: process.env.region});
const s3 = new AWS.S3({apiVersion: '2006-03-01'});

async function downloadData(){
    var rawData = await s3.getObject({Bucket: 'myBucket/', Key: 'myData.json'}).promise();
    var data = JSON.parse(rawData.Body.toString('utf-8'));
    return data;
}

async function invokeStepFunction(){
    const stepfunctions = new AWS.StepFunctions();
    var params = {
        stateMachineArn: process.env.state_machine_arn,
        input: JSON.stringify({"Bucket": 'myBucket/'})
    };
    await stepfunctions.startExecution(params).promise();
}

module.exports.handler = async (event, context) => {
    await invokeStepFunction();
    const data = await downloadData();
};

and this is the error message:

{"errorType":"AccessDenied","errorMessage":"AccessDenied","code":"AccessDenied","message":"AccessDenied","region":null,"time":"2020-03-25T13:13:20.832Z","requestId":"...","extendedRequestId":"...","statusCode":403,"retryable":false,"retryDelay":91.97041111587372,"stack":["AccessDenied: Access Denied","    at Request.extractError (/var/runtime/node_modules/aws-sdk/lib/services/s3.js:816:35)","    at Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:106:20)","    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:78:10)","    at Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:683:14)","    at Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)","    at AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)","    at /var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10","    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)","    at Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:685:12)","    at Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:116:18)"]}

1 answer

  • answered 2020-03-25 17:19 jarmod

    When you await the step function startExecution invocation, you are waiting for AWS Step Functions to indicate that it has received your request to begin execution of the Step Function. It does not indicate that the Step Function has itself run to completion.
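
One way to make the Lambda wait for the execution itself is to poll its status. The following is a minimal sketch, not the asker's code: it assumes `stepfunctions` is an AWS SDK v2 client (`new AWS.StepFunctions()`), that the Lambda's role is allowed to call `states:DescribeExecution`, and that the Lambda timeout is long enough to cover the execution. The helper names `waitForExecution` and `invokeStepFunctionAndWait`, the polling interval, and the attempt limit are all illustrative.

```javascript
// Sketch only: `stepfunctions` is assumed to be an AWS SDK v2 client
// (new AWS.StepFunctions()). Interval and attempt limit are illustrative.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForExecution(stepfunctions, executionArn, { intervalMs = 2000, maxAttempts = 30 } = {}) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const { status } = await stepfunctions.describeExecution({ executionArn }).promise();
        if (status === 'SUCCEEDED') {
            return status;
        }
        if (status !== 'RUNNING') {
            // FAILED, TIMED_OUT or ABORTED: the data was probably never written.
            throw new Error(`Execution ended with status ${status}`);
        }
        await sleep(intervalMs);
    }
    throw new Error('Execution still running after the maximum number of polls');
}

async function invokeStepFunctionAndWait(stepfunctions, params, options) {
    // startExecution resolves once the request is accepted and returns the
    // execution ARN; it does not wait for the state machine to finish.
    const { executionArn } = await stepfunctions.startExecution(params).promise();
    return waitForExecution(stepfunctions, executionArn, options);
}
```

In the handler from the question, `await invokeStepFunctionAndWait(stepfunctions, params)` would replace `await invokeStepFunction()`, so that `downloadData()` only runs once the execution reports SUCCEEDED. The Lambda keeps running (and is billed) while it polls, so this suits short executions best.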

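    So you're executing downloadData before the Step Function has actually stored the data in S3, and the file does not exist in S3 the first time around. (S3 reports a missing object as AccessDenied rather than NoSuchKey when the caller lacks s3:ListBucket permission on the bucket, which would explain the 403.) When you run the Lambda again, downloadData appears to succeed, but it is almost certainly downloading the object stored in S3 by the previous run.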

    You need to execute your download step sometime after the upload has actually happened. You could, for example, make the download an additional step at the end of the upload Step Function.
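
As a rough illustration of the "additional step" idea, the sketch below shows a state machine definition (Amazon States Language, written here as a JavaScript object) with a download task placed after the processing task, plus a handler for that final task. The state names, the Lambda ARNs (`REGION` and `ACCOUNT_ID` are placeholders), and the `makeDownloadHandler` factory are hypothetical; only the `Bucket` input field and the `myData.json` key come from the question. `s3` is assumed to be an AWS SDK v2 S3 client.

```javascript
// Hypothetical state machine definition; state names and ARNs are placeholders.
const definition = {
    StartAt: 'ProcessAndStore',
    States: {
        ProcessAndStore: {
            Type: 'Task',
            Resource: 'arn:aws:lambda:REGION:ACCOUNT_ID:function:processAndStore',
            // Discard the task result so the original input ({"Bucket": ...})
            // is passed unchanged to the next state.
            ResultPath: null,
            Next: 'DownloadData'
        },
        DownloadData: {
            Type: 'Task',
            Resource: 'arn:aws:lambda:REGION:ACCOUNT_ID:function:downloadData',
            End: true
        }
    }
};

// Handler factory for the final step. `s3` is assumed to be an AWS SDK v2
// S3 client, injected so the handler can be exercised without AWS.
function makeDownloadHandler(s3) {
    return async (event) => {
        const rawData = await s3.getObject({ Bucket: event.Bucket, Key: 'myData.json' }).promise();
        return JSON.parse(rawData.Body.toString('utf-8'));
    };
}
```

Because DownloadData only starts after ProcessAndStore has finished, the object is already in S3 when it is read, and the Lambda that calls startExecution no longer needs to wait at all. In a deployed function the handler would be exported as `module.exports.handler = makeDownloadHandler(new AWS.S3({apiVersion: '2006-03-01'}))`.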