How to utilize both KV and TiFlash to solve a single query? #65103
-
|
I have a wide table: 50 varchar(50) columns, 100 million rows. If I query all columns using TiFlash If I query the key column only using TiFlash If I query the whole row using this key (TiKV) So theoretically query can be executed for 0.631 sec + 0.006 sec = 0.637 sec But it works for 52.929 sec So for now my options are:
|
Beta Was this translation helpful? Give feedback.
Replies: 3 comments 5 replies
-
|
TiFlash support later materialization, so in the ideal case will first only read the For the second query I believe that TiDB's optimize convert the |
Beta Was this translation helpful? Give feedback.
-
|
In 8.5.3 - I expect that TiDB has tidb_analyze_column_options=PREDICATE. Which means - if you run ANALYZE (either via auto-analyze or manually) before TiDB is registered which columns are involved in local or join predicates, statistics may not have been collected on the column. When set to PREDICATE - statistics are only collected on indexed columns (regardless of their predicate usage) or columns registered as having predicates in queries. This can be an issue for OLAP benchmark tests with TiFlash, but less so for OLTP workload benchmarks (since critical columns are typically indexed), or for "mature" applications (where all predicates have been seen. You can confirm if you have statistics collected on a column by issuing a SHOW STATS_HISTOGRAMS; If you don't have statistics on the attr_14 column - you can change the variable to ALL (and rerun ANALYZE), or you can use "ALL COLUMNS" option of ANALYZE. So before trying to hint or change the query - I would check the statistics on the columns. Once that is ruled out - then I would look at other reasons/issues. |
Beta Was this translation helpful? Give feedback.
-
|
@den-crane And also, for I notice that there is a warning, can you use |
Beta Was this translation helpful? Give feedback.
Although tidb_opt_enable_late_materialization is on by default, as I mentioned before, which predicate to pushdown to TableScan is totally cost decided, and in you case the optimizer decide not to pushdown to TableScan.
One of the reason is that in TiDB, we does estimate the filter rate for complex filter like
like, currently we simply use 0.8 as the match rate. There is another variable to control the math rate:tidb_default_string_match_selectivity, as workaround, you can also try this query