tsunami

log in
history

bcp utility gotchas

Luke Breuer
2010-04-13 18:52 UTC

Request for help
Let me know about any incorrect or missing information, so that this page can be a most excellent reference for bcp error messages and whatnot.
Notes
Note: this information is for SQL Server 2000 bcp.exe unless otherwise noted. Newer versions may not exhibit some of the below behavior.

Warning: bcp does not warn you if you export data as tab-delimited when the data being pulled has tabs embedded (in character-type fields). This means data exported cannot automatically be imported.

Note: bcp may silently succeed in importing data from a file with more columns than the destination table, if the last column in the table (or format file) can contain the extra fields in the file, delimiters included.
References
  • Understanding Non-XML Format Files (for SQL2008, but useful as a reference)
  • Understanding XML Format Files
    • While Microsoft claims that XML format files are better, I see no indication of this: once one knows which columns specify which data, the Non-XML format files are perfectly easy to parse. I suppose if the error messages were better, this could be a reason to switch...
Format Files
In v8.0 format files, there are two sets of ordinal positions: one for the data file field order and one for the table column order. There are two ways to retrieve the latter's ordinal positions: INFORMATION_SCHEMA.COLUMNS.ORDINAL_POSITION and sp_columns. COLUMNS retrieves its ORDINAL_POSITION from syscolumns.colid, while sp_columns retrieves its ORDINAL_POSITION via count(*) where < colid. Because colid values are not required to be consecutive, these two ORDINAL_POSITION values can be different. bcp appears to use the sp_columns method. Modifying the table by rearranging columns or adding and removing a new column seems to fix the problem.

To get the sp_columns version of ORDINAL_POSITION from COLUMNS, use SQL like
select  COLUMN_NAME, 
        (select  count(*) + 1
        from    INFORMATION_SCHEMA.COLUMNS tx
        where   tx.TABLE_NAME = t1.TABLE_NAME 
            and tx.ORDINAL_POSITION < t1.ORDINAL_POSITION) ORDINAL_POSITION
from    INFORMATION_SCHEMA.COLUMNS t1
where   TABLE_NAME = 'your_table'
Error Messages
Getting table ordinal positions wrong in the format file, whether it be because of the above ordinal position discrepancy or other error, can result in a multitude of error messages:
  • string right truncation
  • restricted data type
  • invalid cast
  • invalid descriptor index
  • unexpected EOF
String Right Truncation
  • see section immediately following Error Messages
  • some text in the data is too large for the destination column
  • a numeric column in the database does not have enough length and/or precision
  • This can happen if there are more columns in the data file than the destination table. If this is the case, bcp will interpret the column delimiters, that occur after the last column in the table, as actual data. It will be likely that the last column in the table won't be large enough to handle all the data, so truncation will happen and this error will be returned.
Unexpected EOF encountered in BCP data-file.
  • see section immediately following Error Messages
  • If the -c switch is used and the line endings are not \r\n, bcp might never find the end of the first record.
  • If the -c switch is used and your field delimiter is something other than \t, ensure you specify the proper delimiter with the -t argument.
  • If the data file is not ANSI (8 bits per character), bcp has issues reading it and tends to spit out the above error. (Although, perhaps using the -w switch would also work.)
  • From Field Length in Books Online : Using default data type sizes (field length) can lead to an "unexpected end of file" error message. This generally occurs with the money and datetime data types when only part of the field occurs in the data file (for example, a datetime value of mm/dd/yy with no time component) rather than an entire string, as expected by SQL Server. When using the default size option, SQL Server expects to read 24 characters (the length of the datetime data type when stored in char format). To avoid this problem, bulk copy data using field terminators, or fixed-length data fields. Or, format the datetime as "yyyy-MM-dd HH:mm:ss.fff".
Invalid column number in the format file
  • Ensure that your "host file field order" is consistent: the first row (excluding the version number and # of columns) needs to be 1, the next 2, etc.: no gaps and no repetitions are allowed.
  • This error may also show up if the "server column order" is incorrect, which could be due to:
    • Specifying a column that doesn't exist in the table (0 is OK, if you aren't uploading that column).
    • Having an invalidly formatted/escaped field terminator, such that the character(s) that got parsed as the "server column order" was not what you intended.
  • Ensure that your column names do not have any spaces in them.
Access Violation (error 0xC00000005)
  • The field length field must always be populated correctly when importing, as the bcp utility sizes its buffers according to this value.
Restricted data type
  • see section immediately following Error Messages
  • If exporting with a format file, ensure all columns are being exported. If only some columns need to be exported, use queryout.
Invalid collation name for source column 4 in format file '...'.
  • data files must end in at least a carriage return if not \r\n
Unable to open BCP host data-file
  • at least in BCP 9.0 (SQL Server 2005), this error will occur if the format file cannot be opened
  • ensure the data file is not locked for editing (MS Excel does this)
I/O error while reading BCP format file
  • format files must end with a newline
Invalid character value for cast specification
The value bcp attempted to upload to a column was the wrong data type. In the simplest cases, this is due to uploading a non-integer value into an integer column, bad date formats, etc. However, it can also occur if {number of columns in data file} != {number of columns in destination table/format file}. A good way to see if there is a column # mismatch is if the error file spit out (you did use -e error.txt, right?) looks like this:
#@ Row 1, Column 9: Invalid character value for cast specification @#
2   0300    HC:36415    21  1   0   <NULL>    20070509    2
3   0300    HC:36415    21  1   0       20070608    2
#@ Row 2, Column 9: Invalid character value for cast specification @#
4   0301    HC:80076    366 1   0   <NULL>    20070509    2
5   0301    HC:80076    366 1   0       20070608    2
Notice that what bcp thinks two rows of data constitute one row to upload to the database. This is because bcp knows that the column delimiters are {tab}, except for the last one, which is {newline} — if it is expecting another column, it'll read right past a {newline}.