i have following stored procedure, assesses accounts in staging table , decides if suitable import or not. if flag them suitabletoimport = true. if not give reason why not.
although it's set based still slow. i've tried moving exist instead of count, testing not seem indicate make of difference.
any recommendation on done?
create or replace function assessinclusionofaccountsfromstaging () returns boolean $$ /*only new accounts valid, detailed issues checking , budge checking duplicates*/ declare countofaccountsinstaging integer; begin /*check have data process*/ countofaccountsinstaging = count(*) importaccountsstaging; if(countofaccountsinstaging) = 0 raise exception 'no accounts available'; end if; /*set suitabletoimport*/ raise notice 'processing accounts...'; update importaccountsstaging set suitabletoimport = true , ((select count (*) importaccountsstaging accountsiterated /*check duplicates against staging enviroment @ org level*/ (accountsiterated.code1 = importaccountsstaging.code1) or (accountsiterated.code2 = importaccountsstaging.code2) )=1) /*check duplicate in masterdb*/ , ((select count (*) masteraccounts /*check potential duplicate @ org level*/ (importaccountsstaging.code1 = masteraccounts.code1 ) or (importaccountsstaging.code2 = masteraccounts.code2 ) )=0) ; /*set comment on why it's not suitable import*/ update importaccountsstaging set reason = concat(reason , 'existing account in staging|') not ((select count (*) importaccountsstaging tempaccounts tempaccounts.code1 = importaccountsstaging.code1 or tempaccounts.code2 = importaccountsstaging.code2 )=1); /*set comment on why it's not suitable import*/ update importaccountsstaging set reason = concat(reason , 'existing account in main|') not ((select count (*) masteraccounts importaccountsstaging.code1 = masteraccounts.code1 or importaccountsstaging.code2 = masteraccounts.code2 )=0) ; /*return values*/ raise notice 'assessment completed human! '; return true; end; $$ language plpgsql;
thanks lot!
it known antipattern - count(*)
can pretty slow operation because have scan possible rows. test based on exists
should pretty fast, because execution stops on first row. newer use count test if exists or not! use exists
always.
Comments
Post a Comment