I'm running into more clients where there are more than 2B records in a table. The problem is that Datagarage.dbo.dgTargetSourceTableRowCount stores the value of records in a table in an INT (limited to 2,147,483,647) so I have to change the column to a BIGINT. Also, the SP [DataGarage].[dbo].[webService_CompletedTableDownload] needs to be updated to use COUNT_BIG(*) instead of COUNT(*), as well as other references to table count where they are stored as an INT vs. a BIGINT...and this only fixes 2/3 of the problem.
(related ask, while we are updating these tables can we update the datetime values to datetime2? The data conversions are implicit so the impact should be minimal but then they tables will be to Microsoft's recommended data type and support finer granularity when it's needed)
The bigger issue is that the count of records on a table is unqualified (SELECT COUNT(*)) so it has to do a full table scan of all the records on the table...which sucks for performance. On a high performing SQL Server I cancelled the count of 3B records after 9 minutes of trying to run....I don't have a good suggestion on how to fix that one...
Jake - thanks for raising this idea. As we deal with larger datasets, we need to be able to effectively handle record counts in collect. I'll add this into the DSP backlog and align to a future DSP release.
Tyler,
This SQL should be made dynamic and will handle Billions of rows in less than 10 seconds.
SELECT INFORMATION_SCHEMA.TABLES.TABLE_CATALOG AS Target,
INFORMATION_SCHEMA.TABLES.TABLE_NAME AS [Table], sys.sysindexes.rowcnt AS RecordCount
FROM sys.sysobjects WITH (NOLOCK) INNER JOIN sys.sysindexes WITH (NOLOCK) ON sys.sysobjects.id = sys.sysindexes.id INNER JOIN
INFORMATION_SCHEMA.TABLES WITH (NOLOCK) ON sys.sysobjects.name = INFORMATION_SCHEMA.TABLES.TABLE_NAME
WHERE (sys.sysobjects.xtype = 'U') AND (sys.sysindexes.indid IN (0, 1)) AND INFORMATION_SCHEMA.TABLES.TABLE_NAME = 'LFA1'
Additional DSW/Transform tables with RecordCount field that should be made BIGINT.
Additionally the LLBLGen entities the Collect/Transform dlls need to be updated as well.
ttWaveProcessAreaObjectTarget
ttWaveProcessAreaObjectTargetSource
dswMilestoneObject
dswSourceTable
ttReportRun
ttWaveProcessAreaObjectTargetDataServicesReport
ttWaveProcessAreaObjectTargetDataServicesReportRemediation
ttWaveProcessAreaObjectTargetDataServicesRule
ttWaveProcessAreaObjectTargetDataServicesRuleAudit
ttWaveProcessAreaObjectTargetExport
ttWaveProcessAreaObjectTargetReport
ttWaveProcessAreaObjectTargetReportRemediation
ttWaveProcessAreaObjectTargetRule
ttWaveProcessAreaObjectTargetRuleAudit
ttWaveProcessAreaObjectTargetSourceReport
ttWaveProcessAreaObjectTargetSourceRule
ttWaveProcessAreaObjectTargetSourceRuleAudit
and their matching store procedures
aspTargetyyyyMetricsUpdPrm
For the TxDot project these tables needed RecordCount/TableCount changed to BIGINT:
DataGarage db tables:
dgTargetSourceTable
dgTargetSourceTableMetric
dgTargetSourceTableRowCount
dgTargetSourceTableRule
DSW db tables:
ttWaveProcessAreaObjectTarget
ttWaveProcessAreaObjectTargetSource
These stored procs needed COUNT(*) changed to COUNT_BIG(*)
but a better solution would be to use a version of the attached stored proc (boaGetTableRowCount) to get the table row count instead of using COUNT_BIG(*).
DataGarage db stored procs:
webService_CompletedBuildIndex
webService_CompletedDBMotoDownload
webService_CompletedTableDownload
webService_FailedBuildIndex
webService_FailedDBMotoDownload
webService_FailedTableDownload
DSW db stored procs:
aspTargetMetricsUpdPrm
aspTargetMetricsUpdPrm
Datagarage.dbo.webTargetSourceTableSetupHor has a CONVERT in the SELECT statement that needs to be updated to the datatype BIGINT:
CONVERT (BIGINT, ISNULL(dbo.dgTargetSourceTable.RecordCount, 0))
I like how sp_spaceused 'tablename' works to pull record counts, maybe it's similar.
The row count should be retrieved from the sysindexes table, as follows:
SELECT TOP (100) PERCENT INFORMATION_SCHEMA.TABLES.TABLE_CATALOG, INFORMATION_SCHEMA.TABLES.TABLE_NAME, sys.sysindexes.rowcnt AS RecordCount
FROM sys.sysobjects INNER JOIN
sys.sysindexes ON sys.sysobjects.id = sys.sysindexes.id INNER JOIN
INFORMATION_SCHEMA.TABLES ON sys.sysobjects.name = INFORMATION_SCHEMA.TABLES.TABLE_NAME
WHERE (sys.sysobjects.xtype = 'U') AND (sys.sysindexes.indid IN (0, 1))