January 30, 2009

The Case Against (Public) Database Links

As promised, here's my rant on db links. Database links aren't as bad as synonyms. It is usually the implementation that is at fault. There are of course many reasons where database links are useful and make things possible that would not otherwise be. The problem is when they are created in a nonsecure way or to accomplish things in code that could be done in a better way.

I don't like them because they complicate security administration. They tend to be out of sight/out of mind after the initial creation. They can become a back door to things that were not intended. Whenever possible, I would prefer to use other methods.

There are two conceptual points that I think make it more clear. First we have the idea of a user or web application connecting to one database and accessing data over a public db link. The public db link subverts any ability we may have had in securing the remote database against the user or web app. Usually a public link is created with some purpose in mind, but its availability is exploited for other purposes and we soon lose track of what or who is using it.

The second is that the purpose of a db link is not to relieve us of the chore of having to log in separately to the remote database. Sometimes db links are created just so a user or app does not need to go to any trouble to authenticate a second session. If the db link is a public one, the temptation to use it that way is spread to any of the applications running on an instance.

And why can't a web-based app just connect directly to the remote database and do its work? The only reason is that the app may need to perform an operation that joins a local and a remote table. For that, a private db link with minimum permissions granted is sufficient. For everything else, it is actually more desireable to have separate sessions to each accessed database because it gives the developer more flexibility in trapping and handling errors. If one database is down, the parts of the app that use another may still function, for example. When things are done over a database link, any Oracle exceptions are all wrapped up in the exception stack for the local database. We get the sometimes ambiguous errors such as "end of file on communication channel."

For database jobs and PL/SQL procedures, db links are the only option for working with data in a remote database. That's fine. Just don't use a public link.

The really scary scenario is where permissions like "select any table" or "unlimited quota" are granted to a user that is connected to via a public db link in another database. I bring this one up because I discovered it as a lowly user at a previous job. I found a db link that gave me access to all of the tables in a remote database, and when I asked one of the administrators whether it would be okay to use that link to get some data for a project I was working on it set off a scramble behind the scenes to close it up. They didn't know the door was left open and they had no auditing in place to raise any flags.

The main point here is that a public database link can add to your administration workload. There are a plethora of security considerations to deal with that could simply be avoided by never allowing public database links. It is just not worth it. And limit your use of private ones too, making sure that proper planning has been undertaken. If developers complain about it, ask them if they'd like to be the ones held responsible for the security and integrity of an invaluable company asset--the data in the database.

January 28, 2009

Data Security

I knew that data theft was a big problem, but I didn't know how big. Then I came across this web site that has tabulated all of the known data breaches in the US. It is called the Privacy Rights Clearinghouse. It is an excellent resource on data security and privacy.

if you take a look at the tables of data breaches, you'll see a lot of them involve large data sets at public Universities. Some are due to lost or stolen laptops. Others are the result of hacking activities.

The hacking jobs remind me of one that occurred at UT Austin McCombs School of Business in 2006. The associate dean, Allison Davis-Blake, had just been named the dean of the Carlson School of Management here at the University of Minnesota, and I was at Carlson at the time. The breach was a pretty big deal, affecting 197,000 student records. What's not shown on the PRC table linked above is that it was an inside job. An employee with access downloaded the data with the intention of selling it.

The UT Austin case underscores an important point: we tend to believe that data hacking is only done by mean people in China and Uzbekistan, but often it comes from within. We can do everything to secure our systems to the outside world, but it means nothing if we do not have the proper controls and auditing in place internally. Nowhere is this more important than at a university, where there are thousands of students working on the same networks as the IT systems, where administrative offices are often housed in the same buildings as classrooms, where student data is replicated across systems in every college, and where there is no central control on data security.

The consequences of a data breach at a university can range from a PR nightmare to hefty fines. Student data is regulated under FERPA and medical data under HIPAA.

Unfortunately, people are content to put data security off until a later time. It is one of those IT services, like backups, that don't really deliver any value in return for the investment. I think there is a general understanding that security is important, but most people prefer to tempt fate and put security off in favor of more glamorous projects. Or they have immediate problems to deal with and can't afford to switch to security. I don't know what I can do other than to keep bringing it up.

I've seen the fallout from a couple of security breaches and it is not pretty. People look for someone else to blame, sometimes people lose their jobs, and every time there are a few administrators who put in a lot of extra hours to rebuild affected systems. It is something I hope I never have to deal with.

January 27, 2009

The Case Against Public Synonyms

I have been walking the 2.5 miles to and from work lately. The subzero temperatures have not affected my mind, because I find this to be a great time to gather my thoughts about work and life. So it was yesterday as I was walking home after a long day. I thought about all the things I have learned about public synonyms and database links that seem to outweigh most of the benefits. I'm going to start with public synonyms, and then as I find the time I'll post my tirade against database links.

This is not to say that public synonyms should not be used, but we tend to employ them with some expectation that they will make our lives easier. That is often not the case when you consider the big picture. They can create unintended problems down the road. I've made a list of the disadvantages of using them here. These are the things that should be weighed before creating public synonyms.

  • Degraded performance—This may be the most famous one. Consultant Steve Adams demonstrated it back in the 8i days. Tom Kyte has several posts on about it, including this one where he repeasts Adams' test. Public synonyms and to a lesser extent private ones create a performance problem in the database instance.
  • Difficulty in maintaining—While we intend for synonyms to make our lives easier, sometimes it can be more work to keep them up to date than to live without them. If an object is dropped, the synonym should be dropped too. If we create a new table, we probably should create the public synonym too. If we rename an object, we should rename the synonym too. And then we have source control. If we become dependent on synonyms, we need to make sure they get migrated through code deployments. Is it really worth the worry?
  • Clutter—This is something we can avoid if we think things through before creating synonyms. But usually we don't, and we end up with public synonyms hanging aroud for objects that no longer exist. Or we have synonyms that refer to renamed objects using the original name.
  • Ambiguity—This can happen with synonyms that refer to dropped objects and with multiple users creating private synonyms in addition to public ones that already exist. I gets messy. It becomes a chore trying to figure out which synonym or object name is taking precedence. Troubleshooting problems is a more cumbersome task.
  • Security—Sometimes it is a good idea not to expose the names of all of your database objects to every user on the system. If you create a public synonym, you're creating something that everyone can see regardless of whether they can execute it or query its contents. If you had a table named "PROJECTED_LAYOFFS_2009," would you want everyone to know? And before you discount the significance of this, think about some of the descriptive names you may have seen chosen for tables or views. And look at the quarterly Oracle Critical Patch Update notices. Sometimes there are exploits where simply knowing the name of an object and having a grant on some DBMS package is enough to get your foot in the door.

There are alternatives to public synonyms that we can use to the same benefit without some of the pitfalls:

  • Alter session set current_schema='target_user'—No need to prefix every object name with the owner name. This is like telling Oracle to assume every object you refer to is owned by 'target_user' unless you specify otherwise.

  • User proxy authentication—In effect, users are authenticating with their own credentials but have authorization to operate within the database as though they are the proxy user. It is also a way to grant developers the ability to develop in a shared schema without having to share the password to that schema. I've had problems getting this to work in Toad, however. I'm fairly certain it is a bug in Toad, but the Quest support rep gave up and closed my support ticket.

  • Prefix everything with the owner!—Yes, this is what everyone is trying to avoid. I must admit, this is my preferred way because it is deliberate and unambiguous. I always know what object I am referring to and there is always explicit feedback that I am working on objects for a particular schema. Why hide schema objects beneath synonyms? If we're working within a database then we're presumably proficient with computers, so we should be fast enough at typing to get by this way without too much trouble.

Maybe the best way to summarize all of this is that we should consider public synonyms only as a last resort. There are alternatives that are better in most cases. At minimum, we should think problems through before creating that next public synonym with the expectation that it is making things easier.