Getting CSV File Data into SQL Azure

I have been using a trial to see if SQL Azure might be a good place to store regular data extractions for some of my auto dealership clients. I have always wanted to develop using SQL Server because of the (it seems) universal connectivity ability. Nearly all the web development frameworks I use make connection to SQL Server.

So I decided to give it a try and after building a vehiclesalescurrent database as follows:

I then researched the proper way to load the table. BULK INSERT is not supported in SQL Azure; only the Bulk Copy Program (bcp Utility) is available to import csv files. The bcp Utiltiy, being a command line program is not very intuitive. It is rather powerful though as it can do import (in), export (out) and create format files. Whatever possible switch you can think of, it has. I forged ahead and tried to do the import without a format file (which is possible apparently). The Format File is an XML file that tells bcp how the csv file maps to the table in the database.  I received error after error mostly relating to cast issues and  invalid end of record. I was under the impression that a csv file had a rather common end of record code known as CRLF or carraige return/line feed. I opened my csv file in notepad++ with the view codes option on to make sure there wasn’t anything unusual going on. There wasn’t.

Having no success the easy way I decided to create a Format file which would tell bcp what to look for definitively as an end of record. The bcp utility will accept either an XML or non-XML Format file and bcp utility will create either for you. I chose the XML format file because I just felt it might be less buggy. This spit out a file easily but I still had to make a few modifications to the resulting XML file. In particular, bcp got the column separator wrong (“\t” for tab) and I changed it to a comma (“\,” ) as the file was csv. Also the last column of data, in my case column 21 needed the terminator “\r\n” which is the offending return and newline (CRLF) codes! Make sure the slashes are the right way; for some reason (I think I saw it in a blog post!) I put forward slashes and I needed the help desk to straighten me out. Anyway here is the bcp command to create an XML format file:
bcp MyDB.dbo.VehicleSalesCurrent format nul -c -x -f C:\JohnDonnelly\VehicleSalesCurrent.xml -U johndonnelly@xxxxxxxx -S tcp:xxxxxxx.database.windows.net -P mypassword

And here is the final correct Format file that you would  include in the same directory as the vehiclesalescurrent.txt data file when running the bcp utility to upload the data to SQL Azure:

<?xml version=”1.0″?>

<BCPFORMAT xmlns=”http://schemas.microsoft.com/sqlserver/2004/bulkload/format&#8221; xmlns:xsi=”http://www.w3.org/2001/XMLSchema-instance”&gt;

<RECORD>

<FIELD ID=”1″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”12″/>

<FIELD ID=”2″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”24″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”3″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”24″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”4″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”24″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”5″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”16″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”6″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”11″/>

<FIELD ID=”7″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”16″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”8″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”100″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”9″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”100″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”10″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”100″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”11″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”12″/>

<FIELD ID=”12″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”11″/>

<FIELD ID=”13″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”12″/>

<FIELD ID=”14″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”30″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”15″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”100″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”16″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”100″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”17″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”40″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”18″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”12″/>

<FIELD ID=”19″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”34″ COLLATION=”SQL_Latin1_General_CP1_CI_AS”/>

<FIELD ID=”20″ xsi:type=”CharTerm” TERMINATOR=”,” MAX_LENGTH=”30″/>

<FIELD ID=”21″ xsi:type=”CharTerm” TERMINATOR=”\r\n” MAX_LENGTH=”12″/>

</RECORD>

<ROW>

<COLUMN SOURCE=”1″ NAME=”ID” xsi:type=”SQLINT”/>

<COLUMN SOURCE=”2″ NAME=”StockNo” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”3″ NAME=”DealType” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”4″ NAME=”SaleType” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”5″ NAME=”Branch” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”6″ NAME=”SalesDate” xsi:type=”SQLDATE”/>

<COLUMN SOURCE=”7″ NAME=”CustNo” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”8″ NAME=”Name” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”9″ NAME=”Address” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”10″ NAME=”City” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”11″ NAME=”Zip” xsi:type=”SQLINT”/>

<COLUMN SOURCE=”12″ NAME=”BirthDate” xsi:type=”SQLDATE”/>

<COLUMN SOURCE=”13″ NAME=”Year” xsi:type=”SQLINT”/>

<COLUMN SOURCE=”14″ NAME=”Make” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”15″ NAME=”Model” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”16″ NAME=”Body” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”17″ NAME=”Color” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”18″ NAME=”Mileage” xsi:type=”SQLINT”/>

<COLUMN SOURCE=”19″ NAME=”VIN” xsi:type=”SQLNVARCHAR”/>

<COLUMN SOURCE=”20″ NAME=”Down” xsi:type=”SQLMONEY”/>

<COLUMN SOURCE=”21″ NAME=”Term” xsi:type=”SQLINT”/>

</ROW>

</BCPFORMAT>

As you can see there is a section that defines what the bcp utility should expect regarding each of the fields and a section that defines what column name goes where in the receiving table.

As I mentioned above the Field Terminators are important (of course) and I found that bcp had them incorrect (“\t”) for a CSV file. Also it had the field terminator for the last field the same as all of the others fields (which was “\t”). This needs to be set to the CR LF codes. Also as I mentioned above the terminator on my last column (21) needed to have back slashes, and I think my google search yielded bad advice and I had forward slashes so:

<FIELD ID=”21″ xsi:type=”CharTerm” TERMINATOR=”/r/n” MAX_LENGTH=”12″/>

Obviously it should be:

<FIELD ID=”21″ xsi:type=”CharTerm” TERMINATOR=”\r\n” MAX_LENGTH=”12″/>

With the good Format FIle I ran the following bcp utility statement to handle the import:
# This one works, starts importing at Row 2, make sure dates are YYYY-MM-DD
bcp MyDB.dbo.VehicleSalesCurrent in C:\JohnDonnelly\VehicleSalesCurrent.csv -F 2 -f C:\JohnDonnelly\VehicleSalesCurrent.xml -U johndonnelly@xxxxxxx -S tcp:xxxxxxxx.database.windows.net -P mypassword -e C:\JohnDonnelly\err.txt

The -e switch throws out a nice err.txt file that gives you much more information about errors than the console does. For complete switch explanation follow the bcp Utility link above.

With the above Format File things got a little better however I had to open a case because I kept getting an “Invalid character value for cast specification” error which cited a SQLDate column type.

I learned from the help desk that the SQLDATE columns in my csv file needed to look like yyyy-mm-dd so I had to use a custom format each time if I opened the csv in Excel. You have to reformat the date columns as yyyy-mm-dd then save the file. That is the quick way anyway.

The support person said there was a bug in bcp relating to dates which leads to the last item and that is out of 9,665 rows there were 34 rows that wouldn’t move due to this error:

“Col 12 – Invalid character value for cast specification”

Column 12 was a birth date of SQLDATE type.

Furthermore 32 of the 34 rows were null in Column 12 so I agree with support that bcp Utility or SQL Azure is buggy with regards to SQLDATE.

I hope this helps someone battling the bcp utility with SQL Azure!

Converting to Outside Payroll

Got a call from a programmer who was helping a company convert a 5-store auto dealership company from Reynolds and Reynolds in-house payroll to an outside payroll service. The payroll company I had never heard of but they were very astute and specialized from what I could tell.

The impetus for the change seemed to be a time and attendance software vendor who appeared to use this payroll service company to to produce the payroll using data from his system. It could have been the other way around, the payroll company offered the Time and Attendance software as an important part of their service. I did not get a demo of the Time and Attendance system but the installer and I shared a remote PC connection and it seemed very comprehensive. Time and Attendance systems, if properly setup and managed (the critical part), saves dealerships money; serious money.

So the programmer who contacted me needed help extracting check history and YTD balance data from the Reynolds system. The programmer, Jim Barbour of Sequencing, Inc. was very¬†professional and managed the project very well. Jim needed me to pull all check history as well as pay, tax and deduction year to date totals for the employees. I gave him a quote that made sense and I went to work. I was successful pulling what I thought was all the data and I forwarded a file for his review. Upon verification by the client it was determined that we did not have it all. In particular there were a few deduction items on a several employees that were unaccounted for. Having experience with Reynolds file structures, I planned for this possibility and I had built time in for the possibility that we might need to do extra digging for the odd item. I found the offending miscellaneous deduction codes and related year-to-date totals buried in a multi-valued field called “I/D”. I extracted that data which was tricky because I wanted my condition statement to filter for only those values I was looking for in the multi-valued field. It didn’t work that way. Depending on the field attributes one may have to grab it all and filter programmatically. All was well although the flat file output was rather ugly. Jim being a good sport was able to select for and import the few codes into his conversion.

The interesting thing about this conversion was that the client needed an export for tech hours flagged to go each payday to the service bureau (old term I know, I did payroll in the early 80’s on IBM 360’s for a service bureau subsequently purchased by ADP called Stat-Tab……I was the payroll manager!) ¬†I wrote the export query and it worked out well however the concern on the client’s side was how to make sure the technician’s flags (time entries) are only paid once in cases where the flags are adjusted or if the RO is re-opened and closed again (as in possibly a warranty ticket). The answer was two-fold; one was to isolate flags by the Work Date field (in the query) rather than the RO date or any other date field. Work Date is the date the time entry is actually written, not the date the work is done or the RO was finished or closed. If we stick with the Work Date condition, during the pay cycle only those flags or adjustments done during the cycle get paid regardless of what goes on with the RO. In addition, the client had Body Shops and they wanted to pay flags (time) on ROs as the job was progressing regardless of whether or not the RO was closed in the pay cycle. Using Work Date allowed them to pay on work in progress as well.

The 2nd step in the process from the dealership’s side (I am an accountant, CPA actually but not current on my license) was to schedule the Work In Process – Labor (WIP) general ledger asset account (usually 247) by RO number. Many dealerships do this to control flagging irregularities and I believe the client had these schedules already in place. However, they became extremely important under the new outside payroll processing system. On the credit side of the of the WIP account will be the ROs total of cost of labor, i.e., all tech flags in total costed out on the RO when it was closed. On the debit side will be all the individual flags paid against that RO’s total costed labor. The RO control subtotal should always zero after closing the RO and the subsequent payroll is applied. The office, payroll clerk or office manager, must monitor the schedule often at the start of outside processing and every pay cycle thereafter for problems or irregularities. Techs are smart and some are inclined to “game” the system. Researching ROs with balances and resolving with either a back flag (a charge-back) or a correction to the RO cost and/or costing procedure is imperative so it doesn’t happen again.

All in all a workable solution for those who want to take time and attendance and payroll to the next step in these automobile dealership systems.

WebImage Helper and WebMatrix

Normally I don’t write about WebMatrix but I have been working on an issue for the better part of an hour and was unable to find anyone online that explained how to do this. So here it is and I hope the search engines grab it and it helps save someone some time.

I wanted to store images in my SQL CE database and I also wanted to use the WebImage helper to resize the uploaded images. I knew that datatype “WebImage” wouldn’t be acceptable input for an image column in SQL CE so I searched for help. I two people who helped me somewhat. Matt Osborne was great but he said putting the pictures in CE wasn’t as interesting as saving them to a folder. Arghhh! Anyway he also refered his readers to a “Simple Data Demo” which unfortunately I couldn’t find. His exact words were “Most of the time the API for saving to a database just take a Byte[] of the image, for instance see the Simple Data demo. The second one that save the image off to the file system is the much more interesting sample.” He was my only hope.

Then on Erik Reitan’s blog I found a listing of all the WebImage object properties and methods with no explanation. But that got me going and I was grateful. I found other blogs that gave specific instruction on how to save an uploaded file to a SQL database in the webmatrix context but I knew that the helper would have been designed to abstract away all that stream arcana and such. My problem was to simply upload a picture, resize it into a thumbnail and a small picture, and then store them both in my database.

I set forth 5 variables ‘photo’ to hold the upload which I typed as WebImage because a WebImage method was grabbing the file for me. A couple of other WebImage variables to hold the resized WebImage objects and 2 byte arrays to hold the result of the WebImage helper method GetBytes() which was the key to problem.

WebImage photo = null;
WebImage UserPicture = null;
WebImage UserThumbnail = null;
byte[] ProfileUserPicture = null;
byte[] ProfileUserThumbnail = null;

Below you will see the method that grabs the image file and that is GetImageFromRequest(“Image”).
I then do to resizes (which Matt Osbornes explains very well) and then I get the bytes and the db.Execute(SQLUPDATE) pulls both pictures right in.

photo = WebImage.GetImageFromRequest("Image");
if(photo != null){
UserPicture = photo.Clone().Resize(width: 400, height: 400, preserveAspectRatio: true, preventEnlarge: true);
UserThumbnail = photo.Clone().Resize(width: 128, height: 128, preserveAspectRatio: true, preventEnlarge: true);

ProfileUserPicture = UserPicture.GetBytes("jpeg");
ProfileUserThumbnail = UserThumbnail.GetBytes("jpeg");
}
else
{
ProfileUserPicture = new byte[0];
ProfileUserThumbnail = new byte[0];
}
var SQLUPDATE = "UPDATE Images SET UserPicture=@1, UserThumbnail=@2 WHERE UserID=@0";
var db = Database.Open("Database");
db.Execute(SQLUPDATE, ProfileUserID, ProfileUserPicture, ProfileUserThumbnail);

And, very important, you have to name the input tag the same as the parameter in the GetImageFromRequest() in this case “Image”.

So the input form markup will look like this:

<form action=”” method=”post” enctype=”multipart/form-data”>
<fieldset>

How about a nice picture?<input type=”file” name=”Image” />
</fieldset>
</form>

In summary, use the WebImage.GetBytes(“jpeg”) to convert to byte array and then upload.

3rd Party Data Extraction with Reynolds and Reynolds and ERAccess

I had a call from a 3rd Party vendor to the automobile dealership industry who was pulling data from their client’s Reynolds DMS’ and all of a sudden couldn’t do it anymore. Not wanting to become colossal pains to their clients they asked me to help. These are nice folks not pulling a high volume of data but important data none-the-less. Definitely not candidates for RCI. I had to do a little research to get myself up to speed as I hadn’t done much R&R work lately. See my last post about a previous conversation with the RCI team.

Installing the Reynolds Software Manager and ERAccess and is the key. To do that someone needs to help you. Otherwise you will probably be remoting into an ERAccess PC on the customer’s network and working there. Once we have access unfortunately we have to use the R&R proprietary Query (7601) or Report Generator programs to pull the data just like before. I’ve yet to find a 3rd Party extraction tool for UniVerse although it probably wouldn’t be hard to wire up PowerShell using the UniOLEDB Data Provider. The problem is someone with access to the Unix root would need to get to the UniVerse Server Admin and setup an account to show files. That’s not going to happen unless someone with an R&R system wants to experiment with me.

Since ADP and R&R use the same database (Currently Rocket UniVerse, formerly IBM UniVerse, originally PICK) their “Certified” interface would be a .Net data provider (hopefully) or XML service. I have a data provider for ADP but it crashes on Win7 systems. At any rate, we were able to solve the vendor’s problems going forward and get them back in the game. The goal now is to streamline this process with Rocket Software’s assistance and in this regard I will keep you posted.

Conversation with the RCI Desk at Reynolds and Reynolds

I recently had a request from a company to find out the current status of Reynolds’ approach to 3rd-party data access. Basically I found out that Reynolds decided to move from a telnet client to what this gentleman said was a “proprietary connection”.

In a nutshell ERALink is out and ERAccess is in, and it has been for some time apparently (I haven’t had a Reynolds client in a while). ERALink I believe, was really only intended for remote access (a telnet client) and the nice gentleman at Reynolds told me that now everyone that accesses ERA XT and ERA XTD systems uses ERAccess.

From the outset the Reynolds person knew that I developed custom Intranet (Sharepoint and others) solutions for automobile dealer groups. He knew I only needed to pull data and that there would never be any write-back to the DMS. Yet he steered me away from the RCI program and said that ERAccess was the way to go.

I’ve used ERALink and their Query Builder software extensively in the past and I really hoped that it wasn’t their best solution for a custom programmer like me, so I pushed on.

I asked what the RCI program provided to the 3rd-party vendor; he said that it was really for the guys that needed bi-directional read-write continuous access to the DMS. I said, “cool, but what do you offer? An ADO.Net data provider, XML Engine, what?” He said, “a set of XML APIs that will return the entire record associated with a data object (say a car or transaction) with no filtering ability.” He was really selling me on it NOT being for me.

So I said “ok, I send you a SOAP-like request and you send back some XML.” He said “yes”. I said “sounds like ADP’s DAP program” and I mentioned the Integration Points ADP has defined and I asked if they had a similar deal. He said “yes”.

So then I said, “well I really would love to have an easier time of it than using ERAccess and QueryBuilder so how much would that be?” Of course he wouldn’t say, but he did say that I, as the developer, would be assessed a one-time up-front certification fee and a per dealer installation and monthly fee. I said “what’s the certification fee, about $10,000?” He said “that would be the bottom end, it most likely would be more.”

I told him I was just a little guy developing important custom solutions for Auto Dealers. He said use ERAccess and Query Builder. I yawned.

Interestingly though, he did say that there were 3rd party vendors writing back to the DMS without Reynolds Certification (i.e. hacking) and that surprised me. I also learned that Reynolds is still using the PICK DBMS which is now called IBM UniVerse. Sheepishly he confided that UniVerse was state of the art in 1984 and I said “that’s alright ADP uses the same thing!.”

Until next time….

InfinitiNet Data Access

Infiniti USA requires its dealers to use a certain software in managing their dealerships. It is called InfinitiNet. I haven’t worked with an Infiniti Dealer in a few years so they may have made some changes to the name, description and platform but I don’t think so.

InifinitiNet is an IBM AS400 based system which is good news for us data extraction and mining people. With the right connection string and Data Provider, we can access using Excel (Microsoft Query), Excel (VBA), PowerShell, ADO.Net etc

The file structure is fairly intuitive and is easily explored with IBM’s ISeries Navigator.

I have maps of many of the more commonly accessed tables in the InfinitiNet database if anyone is interested.

Here’s a sample SQL statement pulling Warranty RO information:

SELECT “WAADIV”,”WAACUS”,”WAAVEH”,”WAASCI”,”WAAAWC”,”WAAAWO”,”WAASTS” FROM I0996.FILEZ.WAA WAA

As you can see the WAA table holds most of what we need to know about warranty ROs. I wrote this SQL statement to help a dealership get a handle on their warranty claim metrics that Infiniti uses to determine if a dealer is a candidate for a warranty audit.

Here’s a sample list of the tables in the FILEZ database:

ADM: Name & Address Master File
ADS: Email Addresses
ANA: Null
APP: Service Related Info
ASROUT1: Some Type of System Table
ASU: Null
AUD: Null
BAB: Only 2 records maybe accts??
BAR: Has a lot of records BAR Date and BAR UID, BAR Name
BAS: Looks like Customer ID and Some EIN#s
BAT: Looks like a batch ID file
BHD: Null
DCBDY: Holds avail car body codes and their desc
DCCLR: Holds colors
DCINT: Holds interior colors
DCMAK: Holds car makes
DCMDC: Holds car model codes?
DCMDL: Holds car model names
DEA: Holds deals! – a zillion fields
DRA: Holds Chart of Accounts! Or might be a report file showing all the accounts going into a report
FFS: Think this is financial statement – holds lines and pages
FII: F&I Inventory Master Interface Download File
GAA: Accounting Header
GCA: Accounting Detail
GAH: General Ledger History File Summary (Just like Saturn’s GLMSTHPF)
GAM: General Ledger Accounts Master File
IAM: Parts Master File
VAM: Vehicle Master File
VRM: Vehicle Owner File
WAA: RO Header
WJA: RO Jobs
WJM: Job Master File
WPA: RO Parts
WEA: RO Expenses
WAS: RO Split Coverages
WSH: Service History Header
WDH: Service History Detail
WAP: RO Parts Expenses Totals
WAT: RO Totals
ESH: CRM Sales History
GAS: GL Schedule Reports
GAT: Period Date Table
GBA: GL Document Addt’l Info
GBT: Document Type Table
GCT: Table Title File
USM: User ID Master
UST: User ID Access Table
VAT: Vehicle Tables

If anyone has a particular issue they need data on, leave a comment and I will try to help.