OpsMgr by Example: The AD Management Pack

This blog entry is the next in a series of Operations Manager-related items which review the steps performed to install, configure and tune management packs in real-world environments.
Installation:
  1. Download the Active Directory Management Pack (http://www.microsoft.com/downloads/details.aspx?FamilyId=008F58A6-DC67-4E59-95C6-D7C7C34A1447&displaylang=en), and the Active Directory Management Pack Guide (http://www.microsoft.com/downloads/details.aspx?FamilyID=4b945737-e77f-4851-a11c-c4f79c36c360&DisplayLang=en).
  2. Read the Management Pack guide – cover to cover. There are important pieces you need to know that this document spells out in detail.
  3. Import the AD Management Pack (either using the Operations console or PowerShell).
  4. Deploy the OpsMgr agent to all domain controllers. Agentless configurations will NOT work for the AD Management Pack.
  5. Get a list of all domain controllers from the Operations console. In the Authoring node, navigate to Authoring -> Groups -> Domain Controllers. Right-click on the group(s) and select View Group Members.
  6. Enable Agent Proxy configuration on all Domain Controllers identified from the groups. This is in the Administration node under Administration -> Device Management -> Agent Managed. Right-click on each domain controller, select Properties, then the Security tab, and check the box to “Allow this agent to act as a proxy and discover managed objects on other computers.” This has to be done for EVERY DOMAIN CONTROLLER (DC), even if the DC is added after your initial configuration of OpsMgr.
  7. Configure the Replication Account under Administration -> Security (full details for this are in the AD MP Guide). This also has to be done for every domain controller, even if a DC is added after your initial OpsMgr configuration.
  8. Validate the existence of the “MOMLatencyMonitors” container. Within this container there should be one sub-folder per DC, named after that domain controller. If the container does not exist, it is often due to insufficient permissions. (See configuring the replication account within the AD MP Guide for details.)
  9. Open the Operations Console. Go to the Monitoring node and navigate to Monitoring -> Microsoft Windows Active Directory -> Topology Views. You may have to set the scope to the AD Domain Controllers Group to get these views to populate.
  10. Check that Active Directory shows up under Monitoring -> Distributed Applications as a distributed application in the Healthy, Warning or Critical state. If it is in the “Not Monitored” state, check for domain controllers on which the agent is not installed or which are in a “gray” state.
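
Steps 5 and 6 tend to drift apart when domain controllers are added later. The cross-check can be sketched in Python, assuming the group members and the agent list have been copied out of the console by hand. The function name, the input shapes, and the sample host names are all hypothetical; nothing here calls an OpsMgr API.

```python
def dcs_missing_proxy(domain_controllers, agent_proxy_flags):
    """Return DC names that have no agent, or an agent without proxy enabled.

    domain_controllers: iterable of DC host names (from View Group Members).
    agent_proxy_flags: dict mapping agent host name -> True when the
        "act as a proxy" box is checked (from the Agent Managed view).
    Both inputs are hypothetical hand-made exports.
    """
    # Compare names case-insensitively, since console exports may differ in case.
    flags = {name.lower(): enabled for name, enabled in agent_proxy_flags.items()}
    return sorted(
        dc for dc in domain_controllers
        if not flags.get(dc.lower(), False)
    )


if __name__ == "__main__":
    dcs = ["DC01.example.local", "DC02.example.local", "DC03.example.local"]
    agents = {"dc01.example.local": True, "dc02.example.local": False}
    # DC02 has proxy disabled and DC03 has no agent at all, so both are listed.
    print(dcs_missing_proxy(dcs, agents))
```

Any name the function returns still needs either an agent deployed (step 4) or the proxy box checked (step 6).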

Tuning/Alerts to Look for: The following are alerts we encountered and resolved while tuning the Active Directory Management Pack.

Alert: AD Replication Monitoring – Access denied

Issue: This occurred on one domain controller, and there was also an alert stating that it failed to create the MOMLatencyMonitors container. We validated the container by logging on to the domain controller, opening AD Users and Computers, enabling View/Advanced Features, and confirming that the container (and the two existing domain controllers as sub-containers) did exist, per the following screenshot.

 

Resolution: This was already resolved, as the MSAA had the permissions required to create this container. We validated that the MOMLatencyMonitors container existed and that it included sub-folders matching the name of each domain controller. (If the container does not exist, it is often due to insufficient permissions; see configuring the replication account within the AD MP Guide for configuration information.)

Alert: Script or executable failed to run

Issue: On the domain controllers, failure on ADLocalDiscoveryDC.vbs on each domain controller prior to SP1 in OpsMgr.

Resolution: Looking at this thread on the Microsoft TechNet website (http://forums.microsoft.com/technet/showpost.aspx?postid=1628491&siteid=17&sb=0&d=1&at=7&ft=11&tf=0&pageid=1), this appears to be a pre-SP1 issue, so we disabled the rule until SP1 releases. To disable, navigate to Authoring -> Management Pack Objects -> Object Discoveries and perform a Find on “AD DC Local Discovery.” You may have two of these (Windows 2000 Server, Windows Server 2003), depending on what versions of the management pack were imported into your management group. Create an override to disable both rules for all objects of “Windows Domain Controller.” Remove these overrides when you implement Service Pack 1 for OpsMgr 2007.

Alert: The Op Master PDC Last Bind latency is above the configured threshold

Issue: The bind from the domain controller identified in the alert to the PDC emulator took longer than 5 seconds (warning) or longer than 15 seconds (error). This occurred in a remote site connecting to a central site with the PDC emulator role.

Resolution: The alert appears to be caused by a slow link between the two locations, or by one of the two servers identified being overloaded. In this particular case the cause was a domain controller that was overloaded due to insufficient hardware, which had to be decommissioned.
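
The two thresholds quoted in the issue amount to a three-way classification. A minimal Python sketch of that logic follows; the function and constant names are hypothetical, and the treatment of values exactly at a threshold is an assumption, since the alert text only says "slower than".

```python
WARNING_SECONDS = 5.0   # warning threshold quoted in the alert description
ERROR_SECONDS = 15.0    # error threshold quoted in the alert description


def classify_bind_latency(seconds):
    """Map a measured bind time (in seconds) to a health state.

    Assumption: a value exactly equal to a threshold does not cross it.
    """
    if seconds > ERROR_SECONDS:
        return "error"
    if seconds > WARNING_SECONDS:
        return "warning"
    return "healthy"


if __name__ == "__main__":
    for sample in (2.0, 7.5, 20.0):
        print(sample, classify_bind_latency(sample))
```

Logging measured bind times against these bands over a few days helps separate a slow link (consistently in the warning band) from an overloaded server (spikes into the error band).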

Alert: Session setup failed because no trust account exists : Script – AD Validate Server Trust Event

Issue: Specific computer accounts were identified multiple times as not containing a trust account.

Resolution: This is caused by either systems which believe that they are part of the domain but no longer are, or often by systems that are being imaged. Resolution of this is either to drop and rejoin the system to the domain or to close the alert if the system is no longer online.

Alert: KCC cannot compute a replication path

Issue: KCC detected problems on multiple domain controllers.

Resolution: Connectivity was lost from the central site to a remote site for a period of several hours. The remote site was down due to a power outage. Errors were logged every 15 minutes from when it was down until when the site was back online. This also occurred when a domain controller had been shut off but still existed from the perspective of Active Directory. This can also occur in environments where the site topology is set to automatically generate the site links but the network is configured so that some sites cannot see other sites. (As an example, in a configuration with a hub in Dallas and sites in Frisco and Plano, where both sites can see Dallas but cannot see each other.)
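
The hub-and-spoke case can be checked on paper before blaming the KCC. A minimal Python sketch follows, assuming you describe by hand which sites can actually route to each other. The function name and data shapes are hypothetical. It lists the site pairs with no direct path, which are the pairs an automatically generated topology may still try to connect.

```python
from itertools import combinations


def unreachable_site_pairs(sites, links):
    """Return pairs of sites that have no direct network path between them.

    sites: iterable of site names.
    links: iterable of (site_a, site_b) tuples for sites that can route to
        each other; each link is treated as undirected.
    """
    connected = {frozenset(link) for link in links}
    return [
        (a, b)
        for a, b in combinations(sorted(sites), 2)
        if frozenset((a, b)) not in connected
    ]


if __name__ == "__main__":
    sites = ["Dallas", "Frisco", "Plano"]
    links = [("Dallas", "Frisco"), ("Dallas", "Plano")]
    # Frisco and Plano can each see Dallas but not each other.
    print(unreachable_site_pairs(sites, links))
```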

Alert: A problem was detected with the trust relationship between two domains

Issue: The domain controllers could not connect to the domain controller in the other domain. This was due to a routing issue between the specific domain controllers and the domain controller in the remote domain. Remote sites were connected via VPN and could not route to that subnet.

Resolution: Provided routing from the domain controllers to the domain controller in the other domain.

Alert: AD Replication is slower than the configured threshold

Intersite Expected Max Latency (min): default 15

Intrasite Expected Max Latency (min): default 5

Issue: This alert will also occur if connectivity is lost between sites for a long enough period of time.

Resolution: If the alert is not current and not repeating, replication is occurring, and the Repadmin Replsum task comes up clean, this alert can be noted (to see whether it occurs at a consistent day of the week or time) and closed. We added a diagnostic to the AD Replication Monitoring monitor, for the critical state, using the configuration of the REPADMIN Replsum task, which is shown here (the admin utilities must be installed on the DC for this to work):

<Configuration>
  <ApplicationName>REPADMIN.EXE</ApplicationName>
  <SupportToolsInstallDir>%ProgramFiles%\Support Tools\</SupportToolsInstallDir>
  <CommandLine>/replsum</CommandLine>
  <TimeoutSeconds>1200</TimeoutSeconds>
</Configuration>

We created the diagnostic to run automatically using:

Program: REPADMIN.EXE
Working Directory: %ProgramFiles%\Support Tools
Parameters: /replsum
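
To try the same command by hand before wiring it into a diagnostic, the settings above can be reproduced in a short Python sketch. This is only an illustration of what the diagnostic runs, not how OpsMgr launches it. The function names are hypothetical, and it assumes the support tools are installed in the directory shown above.

```python
import os
import subprocess

SUPPORT_TOOLS_DIR = r"%ProgramFiles%\Support Tools"  # working directory from the diagnostic
TIMEOUT_SECONDS = 1200                               # timeout from the task configuration


def build_replsum_command(tools_dir=SUPPORT_TOOLS_DIR):
    """Build the argument list for REPADMIN.EXE /replsum."""
    # On Windows, expandvars resolves %ProgramFiles% to the real folder.
    exe = os.path.join(os.path.expandvars(tools_dir), "REPADMIN.EXE")
    return [exe, "/replsum"]


def run_replsum():
    """Run the replication summary and return (exit code, captured output)."""
    result = subprocess.run(
        build_replsum_command(),
        capture_output=True,
        text=True,
        timeout=TIMEOUT_SECONDS,
    )
    return result.returncode, result.stdout
```

Calling run_replsum() on a domain controller returns the same summary text that the task shows in the console.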

Options available included changing the replication topology to replicate every 15 minutes, or configuring overrides. To resolve, we tried creating a custom group for the servers in the location (see the “Creating Computer Groups based on AD Site in OpsMgr” blog entry on http://Cameronfuller.spaces.live.com for additional information) and created an override for the new group, changing the Intersite Expected Max Latency to 120 (double the interval configured in AD Sites and Services). We performed this configuration for each remote location which did not have a 15 minute replication interval. This could also be done for all domain controllers using the domain controller computer group(s). This did not function as expected, but it is included as an example of how overrides can be creatively configured, in this case based upon sites!
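
The arithmetic behind the override value is simple enough to write down. The Python sketch below computes a candidate Intersite Expected Max Latency per site as double the replication interval, which is the rule of thumb used above. The function name and the sample site intervals are hypothetical. Keep in mind that the override itself did not behave as expected in our environment, so treat the result as a planning number only.

```python
def intersite_latency_override(replication_interval_minutes, multiplier=2):
    """Return a candidate Intersite Expected Max Latency in minutes.

    replication_interval_minutes: the interval set in AD Sites and Services.
    multiplier: safety factor; 2 matches the doubling described in the text.
    """
    if replication_interval_minutes <= 0:
        raise ValueError("replication interval must be positive")
    return replication_interval_minutes * multiplier


if __name__ == "__main__":
    # Hypothetical per-site replication intervals, in minutes.
    site_intervals = {"RemoteSiteA": 60, "RemoteSiteB": 180}
    for site, interval in site_intervals.items():
        print(site, intersite_latency_override(interval))
```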

Alert: AD Replication is slower than the configured threshold

Intersite Expected Max Latency (min): default 15

Intrasite Expected Max Latency (min): default 5

Issue: The remote location replication topology was defined to be 60 minutes, not the standard of 15.

Resolution: At this point in time there is no good workaround that changes these settings and still leaves a Microsoft-supported configuration. There are discussions in the newsgroups about exporting the MP, changing the XML, and re-importing it as unsealed, but Microsoft will not support the AD MP if it is changed in this way. The recommendation right now, if your environment does not use the 15 minute latency, is to disable both this alert and the “AD Replication is occurring slowly” alert.

Alert: AD Replication is occurring slowly

Issue: Same as identified in alert “AD Replication is slower than the configured threshold”. This rule does not provide the ability to override the default configuration of 15 minutes. The AD environment is not configured with the default of 15 minutes, so these rules do not apply; replication is still completing within an acceptable timeframe.

Resolution: Disabled this rule (AD Replication is occurring slowly) for group “AD Domain Controller Group (Windows 2003 Server)”. This could also be done for individual servers if there were a limited number of these where the AD replication was not configured with default replication times of 15 minutes. Closed the alerts.

Alert: Script Based Test Failed to Complete

Issue: AD Database and Log : The script ‘AD Database and Log’ failed to create object ‘McActiveDir.ActiveDirectory’. The error returned was: ‘ActiveX component can’t create object’ (0x1AD)

Resolution: Uninstalled OOMADS, listed in Add/Remove Programs as Active Directory Management Pack Helper Object (the original version was .05 in size), and re-installed the 64-bit equivalent, which was AMD64 in this case. To do this we had to copy the MSI locally to the system to install it. After installation it was .07 in size within Add/Remove Programs.

Tuning: Other Issues

Issue: The agent would not install on domain controllers in the DMZ, even though they are in a domain within the forest.

Resolution: Copied over the files and manually installed the agents. Opened up port 5723 on the firewall between these systems and the OpsMgr server. Removed the port 1270 which had been used for MOM 2005. (This issue should only occur if you previously used MOM 2005.)
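
Before installing agents manually, it is worth confirming that the firewall change took effect. A minimal Python sketch of a TCP reachability test from the DMZ domain controller to the OpsMgr server follows. The function name and the host name in the usage comment are hypothetical; only port 5723 comes from the text above.

```python
import socket

OPSMGR_AGENT_PORT = 5723  # port opened on the firewall for the OpsMgr agent


def can_connect(host, port=OPSMGR_AGENT_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Usage (hypothetical host name), run from the DMZ domain controller:
#   can_connect("opsmgr-ms.example.local")
```

A False result means the agent cannot reach the management server on that port, whatever the cause, so check the firewall rule and name resolution before troubleshooting the agent itself.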

Issue: One DC showing extremely high CPU usage/cscript errors.

Resolution: The server was running with 256 MB of memory, and was using significantly more than that even before the OpsMgr agent was deployed to it. Once the agent was deployed, memory usage went significantly higher and resulted in cscript errors which timed out due to the slowness of alerts.

Alert: One or more domain controllers may not be replicating.

Issue: The AD MP will report replication issues across all DCs even if only one is down (and thus not able to replicate its monitor objects).

Resolution: Get all domain controllers monitored by OpsMgr. Validate replication in the environment.

Tuning concept: Each week, close out any unresolved alerts older than 5 days if they represent issues which may have self-resolved.
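
The weekly close-out rule can be written down as a simple filter. The Python sketch below assumes the open alerts have been exported to a list of records. The record fields ("name", "raised", "resolved") and the function name are hypothetical and do not reflect an OpsMgr data format. It only selects candidates; deciding whether an issue really self-resolved is still a judgment call.

```python
from datetime import datetime, timedelta


def stale_alerts(alerts, now, max_age_days=5):
    """Return unresolved alerts raised more than max_age_days before now.

    alerts: list of dicts with hypothetical keys "name", "raised" (datetime)
        and "resolved" (bool).
    """
    cutoff = now - timedelta(days=max_age_days)
    return [a for a in alerts if not a["resolved"] and a["raised"] < cutoff]


if __name__ == "__main__":
    now = datetime(2008, 1, 14, 9, 0)
    alerts = [
        {"name": "KCC cannot compute a replication path",
         "raised": datetime(2008, 1, 2, 8, 0), "resolved": False},
        {"name": "AD Replication is occurring slowly",
         "raised": datetime(2008, 1, 13, 8, 0), "resolved": False},
    ]
    for alert in stale_alerts(alerts, now):
        print(alert["name"])
```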

Alert: Script or executable failed to run.

Issue: On the domain controllers, failure on ADLocalDiscoverDC.vbs on each domain controller prior to OpsMgr 2007 SP1.

Resolution: Looking at the http://forums.microsoft.com/technet/showpost.aspx?postid=1628491&siteid=17&sb=0&d=1&at=7&ft=11&tf=0&pageid=1 thread on the Microsoft TechNet website, this appears to be a pre-SP1 issue, so we disabled the rule until SP1 releases. To disable, navigate to Authoring -> Management Pack Objects -> Object Discoveries, and perform a Find on "AD DC Local Discovery". You may have two of these rules (Windows 2000 Server, Windows Server 2003), depending on the versions of the management pack that were imported into your management group. Create an override to disable both rules for all objects of "Windows Domain Controller". Remove these overrides when you implement Service Pack 1 for OpsMgr 2007.

Problem: We can’t disable this until ALL domain controllers are integrated into OpsMgr. If the rule is disabled before the domain controllers are added, they will never get added.

 

Additional Thoughts: Install the support tools on the domain controllers so you can take advantage of the tasks and use the tools as part of the diagnostics and recoveries.


12 Responses to OpsMgr by Example: The AD Management Pack

  1. william says:

    Nice article!
     
    I don’t understand very well the part about setting up the replication account. The documentation from the Management Pack states that the user should be "Member of the local user group", "Member of the local performance Monitor user group"… As far as I know, these groups don’t exist on a DC? How did you do this?

  2. Operations says:

    Hi William,
     
    If you run Active Directory Users and Computers and look at the Builtin groups, you will see there is a Performance Monitor Users group as well as a Users group. These are Domain Local groups – local to every domain controller in that domain (that’s the lowest level of granularity you get on a DC).
    Hope that helps …  

  3. william says:

    Thanks for your help!
     
    I can see it a bit clearer now… I wish it was explained a bit better in the MP guide.
     
    The other amazing thing (from my perspective) is how overcomplicated it is to grant event log access to the replication account.
     
    1/ The security template must be tweaked for this particular security setting to be seen (see http://support.microsoft.com/kb/323076)
     
    2/ The access must be granted in Security Descriptor Definition Language (SDDL) syntax
     
    Really not nice stuff to ask the customer :(… 

  4. Operations says:

    We discovered this from Robin Drake in the newsgroups, although we haven’t verified it yet:
    (nntp://msnews.microsoft.com/microsoft.public.opsmgr.ad/<#UKJgh97HHA.1208@TK2MSFTNGP03.phx.gbl)
     
    AD replication monitoring overrides have to be identically set in 5 different workflows. Leave any out and you will continue to see alerts as if for the old setting:

    Active Directory Domain Controller Server 2003 Computer Role
    Rules
        AD Replication Performance Collection – Metric Replication Latency: Average
        AD Replication Performance Collection – Metric Replication Latency: Maximum
        AD Replication Performance Collection – Metric Replication Latency: Minimum
        AD Replication Performance Collection – Metric Replication Latency
    Monitor
        Entity Health\Availability\AD Replication Monitoring

  5. Todd says:

    Hello,
     
    Is setting up the replication account and latency containers a requirement to be able to see any AD replication related alerts at all? Or is it only required for seeing latency related alerts? We are having problems setting this account and associated configurations up in our environment and want to know what replication events we’ll see without it.
    – Bob Sweeney.

  6. Operations says:

    Hi Bob,
     
    According to the ADMP Guide, if you want to monitor replication, you must configure an account that will be used for the monitoring. Without the replication account and latency containers, OpsMgr will not generate AD replication alerts. In addition, the latency report is disabled by default due to the high volume of data required to generate the report.

  7. Stefan says:

    Hi Kerrie and Cameron,
     
    I’m also struggling with the account for replication monitoring ;-( I’m monitoring two domains with no trust between the domain where the RMS is and the other domain. The monitoring is done through a Gateway Server. But how would I configure the Replication Monitoring account? Do I need to create two Replication Monitoring accounts? That will probably not work, because the other Rep Mon Account does not have any rights in the domain where the RMS is.
     
    Regards, Stefan Stranger

  8. Operations says:

    Stefan,
    Hate to break the news, but it looks like you have encountered the big limitation to the ADMP in OpsMgr 2007; it only monitors the home domain of the management group. There is only one ADMP account with this description:
     
         This account is assigned to the AD MP Replication Monitoring rules. Please see the AD MP Guide on details how an [sic] the AD MP account needs to be configured.

    Page 10 of the AD MP Guide for OM 07 says you create a Run-As account in the core OpsMgr domain, and associate the AD MP Account to all DCs in the environment.  Unsaid but implicit is all those DCs that are in your AD domain/forest that trust the AD MP Account.
    This fits with previous confirmations from Microsoft that the AD Topology Root distributed application only works in the home domain of the OpsMgr management group. Other domains reached through gateway servers simply can’t benefit from AD Replication Monitoring in OM 07 V1 (applies to the Remote Operations Manager model as well).

  9. Operations says:

    Just an update to our comment of 9-7: We have been able to verify the posting from
    nntp://msnews.microsoft.com/microsoft.public.opsmgr.ad/<#UKJgh97HHA.1208@TK2MSFTNGP03.phx.gbl:
     
    AD replication monitoring overrides have to be identically set in 5 different workflows. Leave any out and you will continue to see alerts as if for the old setting:

    Active Directory Domain Controller Server 2003 Computer Role
    Rules
        AD Replication Performance Collection – Metric Replication Latency: Average
        AD Replication Performance Collection – Metric Replication Latency: Maximum
        AD Replication Performance Collection – Metric Replication Latency: Minimum
        AD Replication Performance Collection – Metric Replication Latency
    Monitor
        Entity Health\Availability\AD Replication Monitoring
     

  10. Michael says:

    How would you set up the overrides for a replication-lag site? I set the overrides for all 5 workflows for my DC in the Repl-Lag site and it seemed to stop the Warnings for "One or More Domain Controllers May Not be Replicating". However, when I upgraded to OpsMgr SP1 the Warnings came back for all my other DCs, saying that they cannot replicate to the Repl-Lag site.

  11. Operations says:

    Hi Michael, this sounds like a SP1 bug. Check the SP1 newsgroup for information regarding bugs related to the service pack.

  12. SANTOSH says:

    This is a great post. Our RMS is in one domain and we are monitoring a couple of other domains with 2 way trust. Our action account is a domain admin account. I think the best option for us would be to add our action account from the RMS domain to AD MP account. I can create run as account for our action account. What do you guys think? How about other domain? I have 2 other domains where agent opsmgr service is running under local system. I don’t see a clear documentation on how the permission should be setup. RMS action account has no rights on other 2 way trusted domains. Is this the right configuration? By the way I love the unleashed book. Very informative. Thanks.
