Saturday, February 25, 2012

Madness?!? This is SQL!

I'm charged with building a web service that accepts data from
multiple locations and stores it to be queried by users. The data we
receive will vary from business unit to business unit and we intend to
add more units as the project progresses. Each unit's system we add
will return data in a different format, and I really don't want to try
and replicate all of their databases.
Someone tell me whether the way I have this planned is a good way of
doing it, or if it's pure madness.
Lets say I get data from 2 different business units as follows:
Unit 1 Data:
<employees>
<employee id="1">
<name>Doe, Jane</name>
<phone>999-999-9999</phone>
</employee>
<employee>
<name>Doe, John</name>
<phone>888-888-8888</phone>
</employee>
</employees>
Unit 2 Data:
<systems>
<system id="123" name="wrk001">
<ip>1.1.1.1</ip>
<os>WindowsXP</os>
<location>D105</location>
</system>
<system id="234" name="wrk002">
<ip>1.1.1.2</ip>
<os>WindowsXP</os>
<location>D106</location>
</system>
</systems>
When a user needs to look up data from unit 1, they'll need to be able
to supply the employee's name and or id, while users searching unit 2
will need to know an ip address and or location name.
First, lets say I have a System Table:
TBL_System
| System_ID | System_Name
+--+--
| 1 | Business Unit 1
+--+--
| 2 | Business Unit 2
+--+--
Then, I create a table that explains what keys users will be searching
for. These keys will be based off of the XML document structure.
| System_ID | System_Key | Key_Name
+--+--+--
| 1 | 1 | id
+--+--+--
| 1 | 2 | name
+--+--+--
| 2 | 1 | location
+--+--+--
| 2 | 2 | ip
+--+--+--
Then, lets say I create a Table called TBL_DataStore to hold this
info.
TBL_DataStore
| DS_ID | DS_Data
+--+--
| 1 | <employee id="z1">
| | <name>Doe, Jane</name>
| |<phone>999-999-9999</phone>
|| </employee>
+--+--
| 2 | <employee id="z2">
| | <name>Doe, John</name>
| | <phone>888-888-8888</phone>
| | </employee>
+--+--
| 3 | <system id="123" name="wrk001">
| | <ip>1.1.1.1</ip>
| | <os>WindowsXP</os>
| | <location>D105</location>
| | </system>
+--+--
| 4 | <system id="234" name="wrk002">
| | <ip>1.1.1.2</ip>
| | <os>WindowsXP</os>
| | <location>D106</location>
| | </system>
+--+--
Finally, I create a Lookup table that holds key values that the user
will be searching for
TBL_Lookup
| System_ID | System_Key | DS_ID | Key_Value
+--+--+--+--
| 1 | 1 | 1 | z1
+--+--+--+--
| 1 | 2 | 1 | Doe, Jane
+--+--+--+--
| 1 | 1 | 2 | z2
+--+--+--+--
| 1 | 2 | 2 | Doe, John
+--+--+--+--
| 2 | 1 | 3 | D105
+--+--+--+--
| 2 | 2 | 3 | 1.1.1.1
+--+--+--+--
| 2 | 1 | 4 | D106
+--+--+--+--
| 2 | 2 | 4 | 1.1.1.2
+--+--+--+--
Now, based on this structure, I can load data from any business unit
without having to change the data structure to add more businesses.
Further, In order to query the data, I don't have to know the specific
xPath of a piece of information.
Is this a good way to accomplish what I'm trying to achieve, or should
I figure out some way to use OpenXML() queries based on XML supplied
by the different business units. My concern about OpenXML() is it's
speed and flexability.
Hello Kris,
What volumes are you dealing with?
My view is the effort in creating a table far out ways the effort to support
a system like the one you are suggesting.
One simple view is to use full text, which allows full text to filter down
in a rough manner and then filter further using normal search predicates.
i.e. system and/or an xpath query
Simon Sabin
SQL Server MVP
http://sqlblogcasts.com/blogs/simons

> I'm charged with building a web service that accepts data from
> multiple locations and stores it to be queried by users. The data we
> receive will vary from business unit to business unit and we intend to
> add more units as the project progresses. Each unit's system we add
> will return data in a different format, and I really don't want to try
> and replicate all of their databases.
> Someone tell me whether the way I have this planned is a good way of
> doing it, or if it's pure madness.
> Lets say I get data from 2 different business units as follows:
> Unit 1 Data:
> <employees>
> <employee id="1">
> <name>Doe, Jane</name>
> <phone>999-999-9999</phone>
> </employee>
> <employee>
> <name>Doe, John</name>
> <phone>888-888-8888</phone>
> </employee>
> </employees>
> Unit 2 Data:
> <systems>
> <system id="123" name="wrk001">
> <ip>1.1.1.1</ip>
> <os>WindowsXP</os>
> <location>D105</location>
> </system>
> <system id="234" name="wrk002">
> <ip>1.1.1.2</ip>
> <os>WindowsXP</os>
> <location>D106</location>
> </system>
> </systems>
> When a user needs to look up data from unit 1, they'll need to be able
> to supply the employee's name and or id, while users searching unit 2
> will need to know an ip address and or location name.
> First, lets say I have a System Table:
> TBL_System
> | System_ID | System_Name
> +--+--
> | 1 | Business Unit 1
> +--+--
> | 2 | Business Unit 2
> +--+--
> Then, I create a table that explains what keys users will be searching
> for. These keys will be based off of the XML document structure.
> | System_ID | System_Key | Key_Name
> +--+--+--
> | 1 | 1 | id
> +--+--+--
> | 1 | 2 | name
> +--+--+--
> | 2 | 1 | location
> +--+--+--
> | 2 | 2 | ip
> +--+--+--
> Then, lets say I create a Table called TBL_DataStore to hold this
> info.
> TBL_DataStore
> | DS_ID | DS_Data
> +--+--
> | 1 | <employee id="z1">
> | | <name>Doe, Jane</name>
> | |<phone>999-999-9999</phone>
> || </employee>
> +--+--
> | 2 | <employee id="z2">
> | | <name>Doe, John</name>
> | | <phone>888-888-8888</phone>
> | | </employee>
> +--+--
> | 3 | <system id="123" name="wrk001">
> | | <ip>1.1.1.1</ip>
> | | <os>WindowsXP</os>
> | | <location>D105</location>
> | | </system>
> +--+--
> | 4 | <system id="234" name="wrk002">
> | | <ip>1.1.1.2</ip>
> | | <os>WindowsXP</os>
> | | <location>D106</location>
> | | </system>
> +--+--
> Finally, I create a Lookup table that holds key values that the user
> will be searching for
> TBL_Lookup
> | System_ID | System_Key | DS_ID | Key_Value
> +--+--+--+--
> | 1 | 1 | 1 | z1
> +--+--+--+--
> | 1 | 2 | 1 | Doe, Jane
> +--+--+--+--
> | 1 | 1 | 2 | z2
> +--+--+--+--
> | 1 | 2 | 2 | Doe, John
> +--+--+--+--
> | 2 | 1 | 3 | D105
> +--+--+--+--
> | 2 | 2 | 3 | 1.1.1.1
> +--+--+--+--
> | 2 | 1 | 4 | D106
> +--+--+--+--
> | 2 | 2 | 4 | 1.1.1.2
> +--+--+--+--
> Now, based on this structure, I can load data from any business unit
> without having to change the data structure to add more businesses.
> Further, In order to query the data, I don't have to know the specific
> xPath of a piece of information.
> Is this a good way to accomplish what I'm trying to achieve, or should
> I figure out some way to use OpenXML() queries based on XML supplied
> by the different business units. My concern about OpenXML() is it's
> speed and flexability.
>

No comments:

Post a Comment