azureBlobStorage Table Function
Provides a table-like interface to select/insert files in Azure Blob Storage. This table function is similar to the s3 function.
Syntax
azureBlobStorage(connection_string|storage_account_url, container_name, blobpath, [account_name, account_key, format, compression, structure])
Arguments
connection_string|storage_account_url — connection_string includes the account name and key (Create connection string), or you can provide the storage account URL here and pass the account name and account key as separate parameters (see account_name and account_key).
container_name — Container name.
blobpath — File path. Supports the following wildcards in read-only mode: *, **, ?, {abc,def} and {N..M}, where N and M are numbers and 'abc' and 'def' are strings (see the sketch after this list).
account_name — If storage_account_url is used, the account name can be specified here.
account_key — If storage_account_url is used, the account key can be specified here.
format — The format of the file.
compression — Supported values: none, gzip/gz, brotli/br, xz/LZMA, zstd/zst. By default, compression is autodetected from the file extension (same as setting it to auto).
structure — Structure of the table. Format: 'column1_name column1_type, column2_name column2_type, ...'.
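As a sketch of the blobpath wildcards: the container name, blob layout, and column structure below are placeholders, and the Azurite development connection string from the examples further down is reused; adjust everything to your own account.
SELECT count(*)
FROM azureBlobStorage('DefaultEndpointsProtocol=https;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;EndPointSuffix=core.windows.net',
    'my_container', 'data/*/*.csv', 'CSV', 'auto', 'column1 UInt32, column2 UInt32, column3 UInt32');
-- counts rows across every .csv blob sitting one directory level below data/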
Returned value
A table with the specified structure for reading or writing data in the specified file.
Examples
Write data into Azure Blob Storage using the following:
INSERT INTO TABLE FUNCTION azureBlobStorage('http://azurite1:10000/devstoreaccount1',
'test_container', 'test_{_partition_id}.csv', 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==',
'CSV', 'auto', 'column1 UInt32, column2 UInt32, column3 UInt32') PARTITION BY column3 VALUES (1, 2, 3), (3, 2, 1), (78, 43, 3);
Because the INSERT is partitioned by column3 and the blob path contains {_partition_id}, one blob is written per distinct partition key; here that produces test_1.csv and test_3.csv. The blob for column3 = 1 can then be read back:
SELECT * FROM azureBlobStorage('http://azurite1:10000/devstoreaccount1',
'test_container', 'test_1.csv', 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==',
'CSV', 'auto', 'column1 UInt32, column2 UInt32, column3 UInt32');
┌───column1─┬────column2─┬───column3─┐
│         3 │          2 │         1 │
└───────────┴────────────┴───────────┘
Or, using connection_string (this reads test_3.csv, which holds the two rows where column3 = 3):
SELECT count(*) FROM azureBlobStorage('DefaultEndpointsProtocol=https;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;EndPointSuffix=core.windows.net',
'test_container', 'test_3.csv', 'CSV', 'auto' , 'column1 UInt32, column2 UInt32, column3 UInt32');
┌─count()─┐
│       2 │
└─────────┘
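The connection_string form should also work for writing; a minimal sketch under that assumption, using the same argument order as in the read above (test_extra.csv is a hypothetical blob name):
INSERT INTO TABLE FUNCTION azureBlobStorage('DefaultEndpointsProtocol=https;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;EndPointSuffix=core.windows.net',
    'test_container', 'test_extra.csv', 'CSV', 'auto', 'column1 UInt32, column2 UInt32, column3 UInt32')
VALUES (4, 5, 6);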
Virtual Columns
_path — Path to the file. Type: LowCardinality(String).
_file — Name of the file. Type: LowCardinality(String).
_size — Size of the file in bytes. Type: Nullable(UInt64). If the file size is unknown, the value is NULL.
_time — Last modified time of the file. Type: Nullable(DateTime). If the time is unknown, the value is NULL.
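The virtual columns can be selected alongside the data columns. A sketch reusing the storage account URL, credentials, and test files from the examples above (the test_*.csv glob is assumed to match the two blobs written earlier):
SELECT _file, _size, column1, column2, column3
FROM azureBlobStorage('http://azurite1:10000/devstoreaccount1',
    'test_container', 'test_*.csv', 'devstoreaccount1', 'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==',
    'CSV', 'auto', 'column1 UInt32, column2 UInt32, column3 UInt32');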
Hive-style partitioning
When the setting use_hive_partitioning is set to 1, ClickHouse will detect Hive-style partitioning in the path (/name=value/) and will allow partition columns to be used as virtual columns in the query. These virtual columns have the same names as in the partitioned path, but start with _.
Example
Use a virtual column created with Hive-style partitioning:
SET use_hive_partitioning = 1;
SELECT * FROM azureBlobStorage(config, storage_account_url='...', container='...', blob_path='http://data/path/date=*/country=*/code=*/*.parquet')
WHERE _date > '2020-01-01' AND _country = 'Netherlands' AND _code = 42;