Saturday, October 14, 2017

Interconnectivity

[Tech]
So I'm on this project to move a (big!!) database to a new home. Upgrade the hardware and RDBMS too, but mainly split up into a more logically planned set of databases (plural), instead of schemas all within one database.
Sounds easy, doesn't it - especially to data-ninjas like all you out there reading this.
Yeah, well, not so much.
The thing that everyone looks at is size - if it's huge (tens of terabytes or more, say) then it must be difficult.
But that's not really so. It's kinda like logistics - once you've done it once then you know how, so doing it again is easy ... it's the organising of doing in 10 billion times at the same time that takes skill.
In this case it isn't the size of the data that makes things hard - it would be as hard with 1 MB or 1 GB or 1 TB or 1 EB of data - but it's the interconnectivity between objects that's the killer.
Not when you move something in a lump - that's when the size becomes a pain - but when you move things relative to each other - that's when things get really awkward!
You see, if you have two things sitting next to each other, so to speak, is (let's say) Schema_a, then one can refer to the other directly by just the name - like this:
select Field_a, Field_b from Table_a
If you move Table_a to another Schema (Schema_b, say), then you need to say
select Field_a, Field_b from Schema_b.Table_a
to distinguish it from the version that stayed home, back in Schema_a. Then you take it to another database in the same instance
select Field_a, Field_b from Database_b.Schema_b_Table_a
and then move it off to another server .....
select Field_a, Field_b from Server_b.Database_b.Schema_b_Table_a

All this seems very simple - and it is. However, there's a certain amount of work involved to change from the first to the fourth example:
1. Find where the external reference is
2. Open the code for editing
3. Edit the code (correctly!)
4. Save and close the code (compiling it as needed)
Which also is simple, albeit taking a little longer than just realising that it needs to be done.

In fact, the difficult pieces are not the ones in that list above, but these:
1. Locating all the places where the code will have to be changed, missing none.
2. Doing it all within the inevitably unrealistic time-constraints imposed on you by people who agreed to the time scheduling without knowing what had to be done.

So the first thing you do is write some code. This code finds every piece of code that refers to any of the objects that's going to move. That could easily be just a few, but if you're working with a database that's grown slowly over the last ten years or so, it's quite possible to number in the thousands.

Then the number of connections ........... what you would call the interconnectivity. Just how many of the queries in your code depend on information that's suddenly going to be in a new place? Will it be hundreds (lucky you!), or will it be tens of thousands (unlucky you!).

Either way, upping your Courage Vitamin might be good .

Good luck!





No comments:

Found Food

I have published quite a few recipes here on my blog over the last few years, and I hope that all my readers have tried at least some of the...