Azcopy interprets source as local and adds current path when it is a gcloud storage https url
We want to copy files from Google Storage to Azure Storage. We are following this guide: https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-google-cloud
We run this command:
azcopy copy 'https://storage.googleapis.com/telia-ddi-delivery-plaace/activity_daily_al1_20min/' 'https://plaacedatalakegen2.blob.core.windows.net/teliamovement?<SASKEY>' --recursive=true
And get this resulting error:
INFO: Scanning...
INFO: Any empty folders will not be processed, because source and/or destination doesn't have full folder support
failed to perform copy command due to error: cannot start job due to error: cannot scan the path /Users/peder/Downloads/https:/storage.googleapis.com/telia-ddi-delivery-plaace/activity_daily_al1_20min, please verify that it is a valid.
It seems to us that azcopy interprets the source as a local file path and therefore prepends the directory we run it from, which is /Users/peder/Downloads/. We are unable to find any argument that marks the source as a web location, and our command follows the example in the guide:
azcopy copy 'https://storage.cloud.google.com/mybucket/mydirectory' 'https://mystorageaccount.blob.core.windows.net/mycontainer/mydirectory' --recursive=true
What we have tried:
- We are doing this on a Mac in Terminal, but we also tested PowerShell for Mac.
- We have tried single and double quotes.
- We copied the Azure Storage URL with the SAS key from the console to ensure it has the correct syntax.
- We tried cp instead of copy, since the azcopy help page uses that.
Is there anything wrong with our command? Or can it be that azcopy has been changed since the guide was written?
I also created an issue for this on the Azure Documentation git page: https://github.com/MicrosoftDocs/azure-docs/issues/78890
The reason you're running into this issue is that the host storage.cloud.google.com is hardcoded in the azcopy source code as the Google Cloud Storage endpoint. From the source:
const gcpHostPattern = "^storage.cloud.google.com"
const invalidGCPURLErrorMessage = "Invalid GCP URL"
const gcpEssentialHostPart = "google.com"
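Since you're using storage.googleapis.com instead, azcopy does not recognize the source as a valid Google Cloud Storage endpoint and treats the value as a directory in your local file system. Changing the source host to storage.cloud.google.com, as in the guide's example, should let azcopy recognize it.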
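To illustrate why the two hosts are treated differently, here is a minimal sketch that applies the quoted pattern to a URL's host. The helper looksLikeGCS is hypothetical and is not azcopy's actual function. It assumes the pattern is matched against the host part of the parsed URL, which may differ in detail from what azcopy does.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// Pattern as quoted from the azcopy source in the answer above.
const gcpHostPattern = "^storage.cloud.google.com"

// looksLikeGCS is a hypothetical helper, not part of azcopy. It assumes the
// pattern is tested against the host of the parsed URL.
func looksLikeGCS(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	return regexp.MustCompile(gcpHostPattern).MatchString(u.Host)
}

func main() {
	sources := []string{
		// Host used in the documentation example.
		"https://storage.cloud.google.com/mybucket/mydirectory",
		// Host used in the failing command from the question.
		"https://storage.googleapis.com/telia-ddi-delivery-plaace/activity_daily_al1_20min/",
	}
	for _, src := range sources {
		fmt.Println(looksLikeGCS(src), src)
	}
}
```

Under this assumption only the first URL matches, which is consistent with azcopy falling back to treating the second one as a local path.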