Distinct Counts

This applies to: Visual Data Discovery

The distinct count feature determines the number of unique values in a column or expression within a selected table. It does so by comparing all the records that a data source configuration pulls from the data store. For example, a distinct count could return the number of:

  • Unique customers in a sales database

  • Unique UPC codes for a category of products

  • Trucks in a company's fleet

To illustrate, consider a single collection with a string field that contains the following three values:

  1. Apple
  2. Orange
  3. Apple

The distinct count returns 2, since there are only two distinct values (“Apple” and “Orange”), while an ordinary count returns 3 to reflect the total number of records. SQL-based connectors might produce a query that looks like this:

select count(distinct myField) from myCollection
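
The difference between the two counts can be reproduced locally. The following is a minimal sketch, not the query any particular connector issues: it uses Python's built-in sqlite3 module as a stand-in data store, reuses the hypothetical names myCollection and myField from the sample query above, and the helper count_and_distinct_count is invented for illustration.

```python
import sqlite3


def count_and_distinct_count(values):
    """Return (ordinary count, distinct count) for a list of strings.

    Assumption: an in-memory SQLite table stands in for the real data store.
    The table and column names (myCollection, myField) are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE myCollection (myField TEXT)")
        conn.executemany(
            "INSERT INTO myCollection (myField) VALUES (?)",
            [(v,) for v in values],
        )
        # Ordinary count: one per record with a non-null value.
        total = conn.execute(
            "SELECT count(myField) FROM myCollection"
        ).fetchone()[0]
        # Distinct count: one per unique value.
        distinct = conn.execute(
            "SELECT count(DISTINCT myField) FROM myCollection"
        ).fetchone()[0]
        return total, distinct
    finally:
        conn.close()


if __name__ == "__main__":
    # The three sample values from the example above.
    print(count_and_distinct_count(["Apple", "Orange", "Apple"]))  # prints (3, 2)
```

The ordinary count is 3 and the distinct count is 2, matching the example. Other data stores may treat letter case or null values differently, so results on a real connector can vary from this sketch.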

Support for this feature by connector is shown in the following table.

Key: Y - Supported; N - Not Supported; N/A - Not Applicable

| Connector | Supported? | Notes |
| --- | --- | --- |
| Amazon Redshift | Y | |
| Amazon S3 | Y | |
| Apache Drill | Y | |
| Apache Phoenix | Y | |
| Apache Phoenix Query Server (QS) | Y | |
| Apache Solr | Y | |
| BigQuery | Y | If you need to access a BigQuery partition, explicitly include an alias for the built in partition column in your select clause, such as select *, _PARTITIONTIME as pt from projectId.datasetId.tableId. |
| Cloudera Impala | Y | Cloudera Impala connectors can receive only a single distinct count field in a query. |
| Cloudera Search | Y | |
| Couchbase | Y | |
| Dremio | Y | |
| Elasticsearch 7.0 | Y | |
| Elasticsearch 8.0 | Y | |
| File Upload | Y | |
| HDFS | Y | |
| Hive | Y | |
| Jira | Y | |
| MemSQL | Y | |
| Microsoft SQL Server | Y | |
| MongoDB | Y | |
| MySQL | Y | |
| Oracle | Y | |
| PostgreSQL | Y | |
| Python | Y | |
| Real Time Sales | Y | |
| Salesforce | Y | |
| SAP Hana | Y | |
| SAP IQ | Y | |
| Spark SQL | Y | |
| Snowflake | Y | |
| Teradata | Y | |
| TIBCO DV | Y | |
| Trino | Y | |
| File Upload (Upload API) | Y | |
| Vertica | Y | |
