Add Binance UM data collector (1min/60min/1d) #2070
feat(data_collector): add Binance USDⓈ-M perpetual futures collector
Description

This PR adds a new data collector for Binance USDⓈ-M perpetual futures (`scripts/data_collector/binance_um/`), supporting three frequencies: 1min, 60min (1h), and 1d. The collector follows Qlib's standard `BaseCollector`/`BaseNormalize`/`BaseRun` pattern and provides:

- bulk download of historical ZIP archives from `data.binance.vision`, converted to per-symbol CSVs
- incremental REST API fetching (`/fapi/v1/klines`) with automatic resume from the existing CSV tail
- an `amount` column (mapped from `quote_volume`, representing USDT notional turnover)
- a `vwap` column (computed as `amount / volume`)
- extra fields: `trades`, `taker_buy_volume`, `taker_buy_amount`
- dumping to Qlib's `.bin` format via `dump_bin.py`

Instrument naming uses the prefix `binance_um.` (e.g., `binance_um.BTCUSDT`) to avoid conflicts with other data sources.

Motivation and Context
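The `amount`/`vwap` derivation mentioned above can be sketched as follows (a minimal illustration with made-up numbers; the raw column name `quote_volume` follows Binance's kline schema, and the tiny DataFrame is purely for demonstration):

```python
import pandas as pd

# Toy raw klines; "quote_volume" is the quote-asset (USDT) turnover per bar.
raw = pd.DataFrame({
    "close": [100.0, 101.0],
    "volume": [2.0, 4.0],            # base-asset volume
    "quote_volume": [200.0, 404.0],  # USDT notional turnover
})

# Map quote_volume -> amount, then derive vwap = amount / volume.
df = raw.rename(columns={"quote_volume": "amount"})
df["vwap"] = df["amount"] / df["volume"]
print(df["vwap"].tolist())  # [100.0, 101.0]
```

Deriving `vwap` at normalize time (rather than storing it raw) keeps the dumped fields consistent with what Qlib expressions expect.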
Binance USDⓈ-M perpetual futures is a major crypto derivatives market; this collector makes its data available to Qlib users through Qlib's standard data pipeline.
The implementation aligns with existing collectors (yahoo, tushare, crypto) for consistency and maintainability.
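The resume-from-CSV-tail behaviour described in the feature list could look roughly like this (a hypothetical helper, not the PR's actual code; the `date` column name and the fixed 1-minute step are assumptions for illustration):

```python
from pathlib import Path

import pandas as pd

def resume_start(csv_path: str, default: str = "2024-01-01") -> pd.Timestamp:
    """Return the timestamp to resume fetching from: one bar after the
    last row of an existing per-symbol CSV, or `default` if none exists."""
    p = Path(csv_path)
    if not p.exists():
        return pd.Timestamp(default)
    last = pd.read_csv(p)["date"].iloc[-1]
    return pd.Timestamp(last) + pd.Timedelta(minutes=1)
```

For example, if a symbol's CSV ends at `2024-01-31 23:59`, the REST fetch would resume at `2024-02-01 00:00` instead of re-downloading the whole history.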
How Has This Been Tested?

Tested on a local venv with real Binance data:

- 1min frequency (ZIP → CSV → normalize → dump): collected 1m klines, dumped to `.bin`, and verified via `qlib.data.D.features()` with `freq='1min'`
- 60min frequency (ZIP → CSV → normalize → dump): collected 1h klines and verified with `freq='60min'` (744 rows)
- Daily frequency (ZIP → CSV → normalize → dump): collected 1d klines and verified with `freq='day'` (31 rows)

Test commands used:
Note: REST API incremental fetching was tested but encountered HTTP 451 (region restriction) in the test environment. The code handles this gracefully by returning an empty DataFrame and logging a warning. ZIP-based historical data collection works reliably.
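That failure mode can be handled along these lines (a hedged sketch, not the PR's code; the query parameters and the six-column subset are assumptions based on Binance's public kline endpoint):

```python
import json
import logging
import urllib.error
import urllib.request

import pandas as pd

def fetch_klines(symbol: str, interval: str = "1m",
                 base_url: str = "https://fapi.binance.com") -> pd.DataFrame:
    """Fetch futures klines; on any HTTP or network error (e.g. HTTP 451
    region restriction) log a warning and return an empty DataFrame."""
    url = f"{base_url}/fapi/v1/klines?symbol={symbol}&interval={interval}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            rows = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        logging.warning("kline fetch failed for %s: %s", symbol, exc)
        return pd.DataFrame()
    cols = ["open_time", "open", "high", "low", "close", "volume"]
    return pd.DataFrame([r[:6] for r in rows], columns=cols)
```

Returning an empty DataFrame (rather than raising) lets batch collection continue over the remaining symbols when a single fetch is blocked.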
Types of changes