More on OpsMgr and I/O

We received the following email from "Jay", but his communication preference settings do not allow a reply to his email. Jay said:

I’m attempting to build a system that can host 10-20 concurrent Operations Consoles. I already plan to get dedicated fast drives (10K SAS), and have separate drives (or "spindles") for each of: OS/apps/swap; DB data; DB logs; DB backup.

My deliberation is whether to have the RMS, Reporting/SRS (not the data warehouse DB), and operational DB/DBS all on one server with 4 dual-core Opterons and 64 GB RAM (OS, DBS, and SCOM, all 64-bit)  OR

Put the operational DB/DBS on its own server, still with 4 dual-core Opterons and 64 GB RAM, and have the RMS and reporting on their own server with 4 dual-core Opterons and 16 GB RAM  OR

go with 3 servers and put the reporting/SRS on its own server.

Microsoft says to "scale up" or "scale out", but my instinct says "scale up" may be better for reducing latency for an interactive app like the console, as long as you have "enough" CPUs and RAM.

With around 1000 managed servers, our operational DB is about 50 GB, so the large amount of memory is intended for caching the DB. With the DB "heavily" cached, and the DB/DBS on its own server, even with a gigabit network, I would think the network latency, including the OS network stack latency, would be significant.

With the one server solution, the "thread execution delay" may be significant, as even 8 processors may not be enough to keep all of the functions: RMS, reporting, SRS, DBS, running without "latency".

What I don’t know is which is more significant: the network latency or the "thread execution delay" latency.

I could also build the single server with 4 quad-cores to cut down the "thread execution delay" latency, but the quad-cores are still a bit pricey.

Salient points:

  • The question is where to position the database servers. There are three scenarios (if we read this correctly):
    • Use two servers: RMS and SRS on one server, with the Operations database and data warehouse on a second server
    • Use three servers: RMS and SRS on one server, Operations database on a second server, data warehouse on a third server
    • Use three servers: RMS on one server, Operations database on a second server, data warehouse and SRS on a third server
  • He has 1000 managed servers, and a 50GB Operational database.
  • The plan is to have 10-20 concurrent Operations Consoles open.

Thoughts:

We vote for option number 3. With 1000 managed servers, it is definitely best to have the RMS all by itself. This also separates the Operational database and the data warehouse, which are updated simultaneously by the management server(s).

Jay did not mention the number of planned management servers. With 1000 agents, you would want to install multiple management servers. A rule of thumb is that once you have at least 100 managed nodes, you will want to install a second management server. (A second management server is always a good idea in case the RMS goes down and you need to promote another server to that role.) While Microsoft supports up to 2000 agents per management server, you may want to add more management servers, depending on the type of data being collected. For example, a single management server is unlikely to be capable of supporting 2000 Exchange servers, because of the particularly heavy load these agents place on it. After the first two management servers, you may want to add another management server for every increase of 250-500 nodes.
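To make that rule of thumb concrete, here is a rough Python sketch. It is only one reading of the guidance above, not an official sizing formula: the function name and parameters are made up for illustration, and it assumes the "every 250-500 nodes" increments are counted from the 100-node point where the second management server is added.

```python
import math


def estimate_management_servers(agents,
                                second_server_threshold=100,
                                agents_per_extra_server=500,
                                max_agents_per_server=2000):
    """Rough estimate of management servers for a given agent count.

    Assumptions (illustrative, not an official formula):
    - one server below the threshold, two at or above it
    - one more server per full block of `agents_per_extra_server`
      agents beyond the threshold
    - never fewer servers than the supported per-server limit implies
    """
    if agents < 0:
        raise ValueError("agents must be non-negative")

    servers = 1 if agents < second_server_threshold else 2
    if agents > second_server_threshold:
        servers += (agents - second_server_threshold) // agents_per_extra_server

    # Respect the supported ceiling of agents per management server.
    return max(servers, math.ceil(agents / max_agents_per_server))


# Jay's environment: about 1000 agents, at both ends of the 250-500 range.
low = estimate_management_servers(1000, agents_per_extra_server=500)
high = estimate_management_servers(1000, agents_per_extra_server=250)
print(low, high)  # 3 5
```

Under these assumptions, 1000 agents works out to somewhere between three and five management servers. Heavier workloads, such as Exchange agents, would push you toward the higher end.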

As the number of active consoles grows, the database load also grows. This is because consoles, either operator or web-based, increase the number of database queries on both the operations and data warehouse databases. Having that many consoles is another reason to separate the two databases onto different servers. Console performance improves quite a bit with Service Pack 1, which will be released shortly.

In addition, 50GB is fairly big for the Operations database. You may want to tune the grooming settings to get the database down to 40GB or perhaps even 30GB. While the database size officially has no limit, a number of OpsMgr sites (including Microsoft) have suggested a database of 40GB or below.

A good article on network bandwidth is one by Satya Vel at http://blogs.technet.com/momteam/archive/2007/10/22/network-bandwidth-utilization-for-the-various-opsmgr-2007-roles.aspx

In addition, you may want to look at Chapter 4 of System Center Operations Manager 2007 Unleashed, which discusses planning your OpsMgr deployment.

Anyone else have comments or suggestions for Jay?
