SSIS and SFTP Task/XML Destination


So far Ive found some serious lack of tools I need in SSIS. Don’t get me wrong, I LOVE SSIS. So we’ve had to either write in house custom components or purchase them from third parties.

The two big ones are SFTP Task and XML Destination.

Here is what I could find form Microsoft about the XML Destination:
http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=303815
I need to output some seriously complex XML… writing this component in house was the only option.

The other Component was a SFTP Task, now FTP is available, but I needed SFTP, my coworkers did a lot of research and came up with a nice plug-in from  cozyroc.com to handle this.



That whole sql migration process…


I wanted to blog about whatever I had learned in the past week but really what came to mind was not answers, but more questions. Questions are good, they have a side effect of helping me learn. So here you go…

Migration:

What is the ideal way to push changes from your development server, to stage, and then production SQL servers with Microsoft products?

The way I am familiar with is using a third party tool, Apex Diff, to script data/schema differences add scripts to VSS, then having a custom script for any data cleanups… and push.

But Microsoft has built this all into Team system I think, I’ve never used it personally.  

Also, versioning… say I make changes to a table, “alter table add column blah blah”, what is the best way for VSS to pick up this change? I’ve always checked out the table creation script, manually added my alter script to the bottom with comments about the workorder where the new column was requested by the client, then checked in the script, hit ”f5″ to push just that portion to my database and called it a day.

But is there a way for VSS to automagically pick up these changes, and who made them… it would save me so much time to just hit “f5″ and be done.

The same goes for SSIS. How do I know what changes were made to SSIS packages before I got to them? I suppose I could compare the XML files… but I need more, I need to know why the change was made and who did it…

I guess what I’m saying is, if I only had Microsoft tools to work with, what are the best practices for pushing these changes to my databases.

All these questions are things I’ll most likely be researching and blogging about in the next few weeks.

This seems like a vital part of having a database, code change and migration process per Microsoft should be clear as a bell right?



xp_fileexist


Sometimes you need to check to see if a file exists before you put the path in your database. Per local rumor this is an undocumented sproc in SQL Server. Love it, use it. 1 means yeah 0 means no. Try it out!

xp_fileexist ‘C:\test.txt’  from SSMS.



SQL Saturday!


http://www.sqlsaturday.com/

I would LOVE to put together a SQL Server Users Group in Knoxville. Even more I would love to eventually put together a SQL Saturday. (link above) if your interested send me a message.



Locking


Database locking issues arose this morning for me. When two updates or selects happen to the same object and the same record, record/page locking occurs. In my defense I was running a script for someone else, also I didn’t realize that there were two recurring jobs that started moments into running my script. I had the DB in restricted user mode. I only knew of one job that was going to start this morning but I should have been done with my work way before it ran.

Here is a simple debug script you can run to see if you have locking on your database and what spids are causing it: 

SELECT status, SPID, CPU, Physical_IO, WaitTime, Blocked, HostName, Program_Name, nt_username,Login_Time, Last_Batch, Open_Tran, waittype, lastwaittype
FROM master..sysprocesses with (nolock)
where blocked >0 or open_tran > 0

Also there is the SQL Server 2000 sp_lock command MSDN article about it’s results is here.

After a phone call or two I managed to disable the jobs and kill off any spids that were keeping my script from finishing. I then reenabled/ran them safely when I was done. The angels sang, the publish was complete, and had passed testing.

Lesson learned? Be more attentive to what jobs are running on the server, ugh. *smacks forehead* Gah.



database normalization, are you normal?


It seems that the question of the hour is always what my opinion is on Normalization. Or maybe they are asking me if I’m normal… hmm, I think I’ll not go there. ;)
What is normalization? Well, its the method in which you reduce redundancy in a database.

There are several industry standards on this, they have fancy schmancy names (not really). The standards I’m covering here were created by an Englishman named Edgar F. Codd.

His concepts go from the most basic form of Normalization 1st Normal Form(1NF), to the stricter form, 3rd normal form(3NF). (There are more however I’m not going to cover them here)

To get the gist of how this all works lets define our database:
Your database stores information on Mickey Mouse and his Balloon Race competition.
Mickey and his friends need three mouskatools to get Donald Duck through the race.
Our table displays which character used which tool.
Mickey and Donald used Stickers to fix the holes in the balloon.
Mickey used the baby elephant to weight down the balloon.
Pete used the rope swing to get unstuck from a tree.
Pluto tags along but doesn’t really do anything.

So you might be tempted to have a table that looks like this.

Character_ID Name AnimalType Tool
1 Mickey mouse stickers, baby elephant
2 Pete dog rope ladder
3 Donald duck stickers
4 Pluto dog  

Lets talk about 1st Normal Form(1NF):
The rule of 1st normal form says that you shouldn’t have any data that repeats within one cell (above). Nor should you, in this scenario, define multiple columns that hold the same type of data(Tool1,Tool2). Basically we are removing redundancy from left to right (horozontally)

Character_ID Name AnimalType Tool1 Tool2
1 Mickey mouse stickers baby elephant
2 Pete dog rope ladder  
3 Donald duck stickers  
4 Pluto dog    

In order for us to follow this Form we need to break the above data up into multiple tables with primary keys.

character_id Name animaltype
1 Mickey Mouse mouse
2 Pete dog
3 Donald Duck duck
4 Pluto dog
character_id tool
2 rope ladder
1 stickers
1 baby elephant
3 stickers

Lets talk about 2nd Normal Form (2NF):
The rule of 2NF says that you must follow 1NF AND seperate out groups of data into their own tables.

character_id Name
1 Mickey Mouse
2 Pete
3 Donald Duck
4 Pluto
tool_id tool
a rope ladder
b stickers
c baby elephant
character_id tool_id
1 b
1 c
2 a
3 b

3rd normal form (3NF):
The rule of 3NF says that you must follow 2NF, AND remove redundancy from top to bottom (vertically).

character_id Name animal_id
1 Mickey Mouse 1
2 Pete 2
3 Donald Duck 3
4 Pluto 2
character_id tool
2 rope ladder
1 stickers
1 baby elephant
3 stickers
animal_id animaltype
1 mouse
2 dog
3 duck
   

So there you go, a basic basic idea of 1-3 Normal Form.



Need a new laptop


I am in the market for a new laptop, one that will be my primary machine and can be left on 24/7 if need be.

I want to be able to connect it to my dual monitors, it will replace my desktop. My current laptop is a really nice little Sony VIAO and it’s less than 5 pounds, Ive had it for a few years and it’s been good to me. I can run Virtual Machine on it but over time it get’s extremely hot. 

I’ve been thinking about getting a Mac Book Pro with 4GB of RAM. With VMWare I can run windows. But I’m not sure how SQL Server performs. I wouldn’t be using it for a server, just as a development platform. I don’t think it supports dual monitors. I also need to be able to remote into work via Cisco VPN.

One thing I dont like about laptops is the preinstalled software that comes with the machine. I just want Windows please…

Advice is welcome.



what’s faster?


select count(*)  or select count(1) ?

I think they are the same.



What is a clustered index anyway?


 Well I had assumed that since a “cluster” is two or more things squished together, that a “clustered index” was an index that was composed of two or more fields. 

Did you see that word? “Assumed”, as my fellow DBA Eric says “makes an ASS out if U and ME”…ass-u-me assume get it…haha…yeah okay. *cough* SO ANYWAY…
I was wrong to assume, imagine that. But a clustered index actually orders the physical data, stored in “pages”. My best guess at the reason WHY the word “clustered” is used is that it takes all the pages that the data is stored on, reorders them, and puts them together for faster read access. But that’s just my guess.

Here is an example:

create table MyTestTable
(
HouseNumber int,
StreetName varchar(8000)
)

insert into MyTestTable values (125, ‘Skyline’)
GO
insert into MyTestTable values (234, ‘Skyline’)
GO
insert into MyTestTable values (90, ‘Skyline’)
GO
insert into MyTestTable values (75, ‘Skyline’)
GO
insert into MyTestTable values (127, ‘Skyline’)
GO

–BEFORE CLUSTERED INDEX
select * from MyTestTable
125 Skyline
234 Skyline
90 Skyline
75 Skyline
127 Skyline

Now let’s add a clustered index to this table:

CREATE CLUSTERED INDEX MyTestTable_HouseNumber
ON MyTestTable (HouseNumber)
GO

–AFTER CLUSTERED INDEX
select * from MyTestTable
75 Skyline
90 Skyline
125 Skyline
127 Skyline
234 Skyline

 


 

 

 

Clustered indexes are better for queries using ranges of data. If a human being were told to get all books off the shelf whos author’s last names start with Ri-Ro if the books are in order you can grab them and come back. If they are out of order your going to have to go an index, look up where the books are located, visit each shelf and then grab the book requested. Or worse, if there is no index, you need to scan each book in the library taking the book you need down as you go. (in SQL they call this a table scan)