How to Estimate maximum size of new column in a table and check actual size of existing column in a table in ORACLE

I have described here how to estimate the maximum size of a new column and check the actual size of exiting column. You might know it already then ignore it otherwise might be interesting to know.

 

How to estimate data in new column when adding new table:


Table 1: Message_TRANSACTION_LOG


select C.COLUMN_NAME,C.DATA_LENGTH from all_tab_columns c

where C.TABLE_NAME='MESSAGE_TRANSACTION_LOG'

order by C.COLUMN_NAME;

COLUMN_NAME

DATA_LENGTH

ACCOUNTNAME

50

CAMPAIGNNAME

120

CONSUMERMARKETID

2

CONSUMEROPERATORID

22

CONSUMERTIMESTAMP

7

CONTENTPROVIDERID

22

CONTENTPROVIDERMARKETID

2

CONTENTPROVIDERTIMESTAMP

7

COST

22

CREATED

7

DEFAULTREPORTFLAGS

22

DESTINATIONADDRESS

256

DISTRIBUTIONLISTID

22

EXPORTSTATUS

22

MESSAGELOGID

22

MESSAGERECONCILIATIONID

255

MESSAGETYPE

1

MTACCOUNTID

22

OPERATORDISCOVERYMETHOD

20

OPERATORMARKETID

2

OPERATORMESSAGEID

150

OPERATORTIMESTAMP

7

ORIGINATINGADDRESS

256

ORIGINATINGADDRESSTON

22

SIZEINKB

22

STATUSREPORTFLAGS

22

TRAFFICCLASSID

50

TRANSACTIONID

22

USERDATA

4 000

WASSENTASYNC

22

WASUSINGALIAS

22

WASWEBPASSCODESENT

22

Table 2: Column size in bytes in table MESSAGE_TRANSACTION_LOG

 

Now let’s consider two columns CONSUMEROPERATORID and USERDATA here. From table 1 we can see the type of CONSUMEROPERATORID is number and USERDATA is varchar2(1024 char) and from table 2 we can see the maximum byte taken by these columns are 22 bytes and 4000 bytes. So if we consider the maximum size of these columns it should be:

Column Name

Length of the column in bytes

Size of the column in bytes

CONSUMEROPERATORID

22

22 * number of rows

USERDATA

4000

4000 * number of rows

Table 3: Estimated maximum size of the columns

 

Now anybody may wonder why the byte we consider 22 for CONSUMEROPERATORID and 4000 for USERDATA here where some number take 2 byte and single character take 1 byte. These sizes depend based on the character set and most of the database use character set %UTF8 now as we are using. In UTF8 character set every character can take up to 4 bytes (e.g if the character in some other language instead of English). Now you ask then for USERDATA it should be 4096(1024*4) but here it’s showing 4000 since the limit of varchar2 is 4000 bytes. Number can take 2 to 22 bytes. E.g

         X          Y   VSIZE(X)   VSIZE(Y)

---------- ---------- ---------- ----------

        99        100          2          2

      9999      10000          3          2

    999999    1000000          4          2

  99999999  100000000          5          2

9999999999 1.0000E+10          6          2

1.0000E+12 1.0000E+12          7          2

1.0000E+14 1.0000E+14          8          2

1.0000E+16 1.0000E+16          9          2

1.0000E+18 1.0000E+18         10          2

1.0000E+20 1.0000E+20         11          2

1.0000E+22 1.0000E+22         12          2

1.0000E+24 1.0000E+24         13          2

1.0000E+26 1.0000E+26         14          2

1.0000E+28 1.0000E+28         15          2

1.0000E+30 1.0000E+30         16          2

1.0000E+32 1.0000E+32         17          2

1.0000E+34 1.0000E+34         18          2

1.0000E+36 1.0000E+36         19          2

1.0000E+38 1.0000E+38         20          2

 

How to check the actual size of column in a table:

There is data dictionary view to check total size of a table, database, datafile, index, lob segment etc but there is no dictionary view to check the size of column as far I know L . It’s easy to get information from dictionary view since these views give the information from statistics and it takes less time.

But there is way to get the actual size of column by using a function vsize. vsize return the number of bytes.

Lest take a small table for example, here I am checking the estimated maximum size and actual size of column TARIFFCLASSGROUPNAMEID in CDR_ACCOUNT table:

select * from CDR_ACCOUNT;

CDRACCOUNTID

ACCOUNTID

TARIFFCLASSID

TARIFFCLASSGROUPNAMEID

1 045 201

10 452

se-large-premium

1 210 401

12 104

SEK555

1 200 401

12 004

SEK666

1 036 201

10 362

SEK667

1 030 501

10 305

SEK1000

1 060 201

10 602

SEK1000

1 070 201

10 702

SEK100

1 050 302

10 503

SEK1000VAT065

1 050 301

10 503

SEK1000

1 045 202

10 452

SEK859

1 030 502

10 305

se-large-premium

1 050 303

10 503

se-large-premium

1 230 201

12 302

SEK777

 

select count(*) from CDR_ACCOUNT;

COUNT(*)

13

 

Estimated size in bytes:

select C.COLUMN_NAME,C.DATA_LENGTH "Maximum size in Bytes"

from all_tab_columns c

where C.TABLE_NAME='CDR_ACCOUNT'

order by C.COLUMN_NAME;

 

COLUMN_NAME

Maximum size in Bytes

ACCOUNTID

22

CDRACCOUNTID

22

TARIFFCLASSGROUPNAMEID

256

TARIFFCLASSID

40

 

So estimated maximum size for column TARIFFCLASSGROUPNAMEID  should be 256*13=3328  (Here considering value in every rows)

Actual size in bytes:

select sum(vsize(TARIFFCLASSGROUPNAMEID)) from CDR_ACCOUNT;

SUM(VSIZE(TARIFFCLASSGROUPNAMEID))

48

 

So the estimated maximum size of TARIFFCLASSGROUPNAMEID is 3328 where actual size is 48 here.

 

Note: Since vsize and sum are functions and not getting the information from statistics so to get the size of column in big table will take time and this thing should not be execute in big table in production database without notifying DBA. But by giving date/timestamp in where condition in big table you can check the actual size of a column for a day or a week then you can guess for a month or year. But then also need to notify DBA.

 

Comments