-
Notifications
You must be signed in to change notification settings - Fork 420
feat(block/gs): report pre-signed URL expiry for GCS #9927
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Return actual expiry time from GetPreSignedURL in GCS adapter to match S3 implementation. Part of #6347.
| if config.PreSignSupport { | ||
| goldenFile = "lakectl_stat_pre_sign" | ||
| if config.BlockstoreType == "s3" { | ||
| if config.BlockstoreType == "s3" || config.BlockstoreType == "gs" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven’t been able to run this test locally. Please do tell me if I am doing something wrong or missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements pre-signed URL expiry time reporting for the GCS adapter to match the S3 implementation. Previously, the GCS adapter returned a zero time value for expiry, and now it returns the actual expiry time configured for the signed URL.
- Updated
GetPreSignedURLin GCS adapter to return the signed URL's expiry time - Extended test expectations to include GCS alongside S3 for expiry time validation
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| pkg/block/gs/adapter.go | Returns the actual expiry time (opts.Expires) instead of zero time, removing the TODO comment |
| esti/lakectl_test.go | Updated test conditions to expect expiry reporting for both S3 and GCS blockstore types |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
arielshaqed
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Not sure this returns the correct value; please see comment below and the corresponding implementation in the S3 block adapter.
| } | ||
| // TODO(#6347): Report expiry. | ||
| return k, time.Time{}, nil | ||
| return k, opts.Expires, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this is the required value. The returned time here should be the actual expiration time of the generated URL. This can differ from what gs.newPreSignedTime returns! For instance if the current short-term credentials will expire before a.preSignedExpiry then (AFAIK, from behaviour of S3 - but no reason for GCP to differ) the URL will have a shorter expiration time. It is this expiration time we need to emit here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that behaviour is only limited to AWS S3, where they separate authentication and authorization. Reading the docs, I don't see any mention that the short-lived credentials affect the expiration time.
Some additional references:
- Revoke the signed URL of the object that was created in GCP Cloud storage - Stack Overflow
- Also asked ampcode's oracle to research
I don’t work with GCS frequently, so I may be wrong. If you have sources that state otherwise, please share.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arielshaqed, I see on https://docs.lakefs.io/v1.73/reference/configuration/#blockstoregs that it requires a Service account key, which I believe does not expire.
Presigned url is tied to a service account rather than a session like in AWS.
From the docs (emphasis mine):
When you generate a signed URL, you must specify an account that has sufficient permission to make the request that the signed URL will make.
In most cases, the account is a service account.
In cases where you create your own program to generate signed URLs, it's possible to use a user account, if it has an associated HMAC key.
I don't think you can even create a signed url with a token, at least I get a different error when I try in Python.
AttributeError: you need a private key to sign credentials.the credentials you are currently using <class 'google.oauth2.credentials.Credentials'> just contains a token. see https://googleapis.dev/python/google-api-core/latest/auth.html#setting-up-a-service-account for more details.
I did try to create an impersonated credentials that expires before a presigned url and the upload worked fine:
Script to create presigned url from impersonated credentials
#!/usr/bin/env python3
"""Test GCS presigned URL with short-lived credentials."""
import time
import datetime
import requests
from google.cloud import storage
from google.auth import impersonated_credentials
import google.auth
import google.auth.transport.requests
SERVICE_ACCOUNT_EMAIL = "lakefs-saugat@lakefs.iam.gserviceaccount.com"
BUCKET_NAME = "saugat-local-testing"
def get_short_lived_credentials(lifetime_seconds=30):
"""Create impersonated credentials with short lifetime."""
source_credentials, _ = google.auth.default()
credentials = impersonated_credentials.Credentials(
source_credentials=source_credentials,
target_principal=SERVICE_ACCOUNT_EMAIL,
target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
lifetime=lifetime_seconds,
)
credentials.refresh(google.auth.transport.requests.Request())
return credentials
def create_presigned_url(credentials, object_name, expiry_seconds=300):
"""Create a presigned PUT URL using the given credentials."""
client = storage.Client(credentials=credentials)
blob = client.bucket(BUCKET_NAME).blob(object_name)
return blob.generate_signed_url(
version="v4",
expiration=datetime.timedelta(seconds=expiry_seconds),
method="PUT",
content_type="text/plain",
)
def upload_to_url(url, content):
"""Upload content to presigned URL. Returns status code."""
r = requests.put(url, data=content, headers={"Content-Type": "text/plain"})
return r.status_code
if __name__ == "__main__":
import time
creds = get_short_lived_credentials(10)
print(f"Credentials expire: {creds.expiry}")
obj = f"test-{int(time.time())}.txt"
url = create_presigned_url(creds, obj, 60)
print(f"Created URL for: {obj}")
time.sleep(15)
ret = upload_to_url(url, "test1")
print(f"Upload 2 (after cred expiry): {ret}")$ python3 ./gcs_presign_test.py
Credentials expire: 2025-12-31 07:52:38
Created URL for: test-1767167548.txt
Upload 2 (after cred expiry): 200
Return actual expiry time from GetPreSignedURL in GCS adapter to match S3 implementation.
Part of #6347.
In GCS, it seems that the signed URL expiry is independent of credentials' expiry (compared to AWS temporary credentials), so we can just return the expiry time we set on the signed URL.