Windows Server 2008 R2 Thread, Considerations for rebuilding a DC? in Technical; So, the time has come to rebuild Controller1, one of two DCs, it had a software failure months ago so ...
  1. #1

    Join Date
    Mar 2010
    Location
    shadowx@AllEvil:/
    Posts
    222
    Thank Post
    12
    Thanked 28 Times in 25 Posts
    Rep Power
    13

    Considerations for rebuilding a DC?

    So, the time has come to rebuild Controller1, one of two DCs, it had a software failure months ago so we have been using our second DC to get us by until this summer.

    The situation is that Controller1 has random software glitches: it fails to log people on, fails to authenticate in general, fails to assign and check permissions correctly, and sometimes fails to resolve DNS, all mostly at random. It sits in a failover/load balancing cluster with Controller2, so there is no primary or secondary; they both operate at the same level and sync data between them (when working...). My basic plan is thus:

    Check Controller1 for any data we need to keep (local files for whatever reason)
    Make sure Controller2 holds all the FSMO roles (Thanks google!)
    Shut down Controller1 for the last time, then delete the metadata from the AD on Controller2 that points to the now dead Controller1 (Again, google)
    Then start the software rebuild, starting with Server 2008 R2 and whacking all our services on there (Clustering, DNS, DHCP, AD-DC, Backup etc...), obviously keeping it isolated once I start on DNS/DHCP/AD-DC until I sync it with Controller2
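
    For the "make sure Controller2 has all the FSMO roles" step, here is a minimal sketch in Python that checks a saved copy of the `netdom query fsmo` output. It assumes the output lists one role per line, with the role name and the holder separated by a run of spaces; the hostnames and the sample text are hypothetical, so check them against what your own DC actually prints.

```python
import re

# The five FSMO roles as netdom is assumed to label them.
FSMO_ROLES = {
    "Schema master",
    "Domain naming master",
    "PDC",
    "RID pool manager",
    "Infrastructure master",
}


def parse_fsmo(output):
    """Map each FSMO role name to the DC that holds it."""
    holders = {}
    for line in output.splitlines():
        # Role and holder are assumed to be separated by two or more spaces.
        parts = re.split(r"\s{2,}", line.strip())
        if len(parts) == 2 and parts[0] in FSMO_ROLES:
            holders[parts[0]] = parts[1]
    return holders


def roles_not_on(holders, expected_dc):
    """Return the roles that are missing or held by a different DC."""
    expected = expected_dc.lower()
    return sorted(
        role for role in FSMO_ROLES
        if holders.get(role, "").lower() != expected
    )


# Illustrative text only; these hostnames are made up.
SAMPLE = """\
Schema master               controller2.example.local
Domain naming master        controller2.example.local
PDC                         controller2.example.local
RID pool manager            controller1.example.local
Infrastructure master       controller2.example.local
The command completed successfully.
"""

if __name__ == "__main__":
    stray = roles_not_on(parse_fsmo(SAMPLE), "controller2.example.local")
    print("Roles still to transfer:", stray)
```

    An empty list means every role is where you expect it; anything else needs transferring before Controller1 is demoted or switched off.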

    Anything I'm missing? We built the system last year so I have experience in setting up a windows network from scratch but this time I have to worry about not corrupting the existing AD data! We had Novell before this so I wasn't as worried about accidentally deleting/corrupting the data.

    Once it's rebuilt I will then force a replication from Controller2 to Controller1. Do I need to worry about a reverse replication, where C2 sees the empty AD database of C1 and says "Ah well I best delete all my data too then!"? I understand (I believe) that I can force a one-way replication from a server to the current one (i.e. if I log on to the empty C1 and force it to replicate FROM C2), but what I don't want is some sort of automatic replication; that would be.... awkward....

    I can't use the backup of the server since, from what I understand, it would likely be older than the default tombstone lifetime of the AD. The backups we have are full system state backups, designed to be plastered on top of a new install of the OS in the event of catastrophic failure, but I have a feeling using those backups would be just as much of a headache, plus they have a good chance of carrying whatever software glitch caused the problem in the first place.
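
    The tombstone point comes down to simple date arithmetic, sketched below in Python. The backup dates are hypothetical, and the tombstone lifetime is passed in as a parameter because it depends on the forest: 60 and 180 days are the commonly quoted defaults, but the real value is whatever the `tombstoneLifetime` attribute in your directory says.

```python
from datetime import date


def backup_is_usable(backup_date, today, tombstone_days):
    """True if the backup is younger than the tombstone lifetime."""
    age_days = (today - backup_date).days
    return 0 <= age_days < tombstone_days


if __name__ == "__main__":
    today = date(2012, 7, 25)          # hypothetical "now"
    old_backup = date(2012, 1, 10)     # hypothetical backup taken before the failure
    newer_backup = date(2012, 3, 1)    # hypothetical later backup

    for tombstone_days in (60, 180):
        for backup in (old_backup, newer_backup):
            ok = backup_is_usable(backup, today, tombstone_days)
            print(backup, "with", tombstone_days, "day lifetime:",
                  "usable" if ok else "too old")
```

    Even a backup that passes this check would still carry whatever was wrong with the server at the time, so the age test is necessary rather than sufficient.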

    Last question, am I better to use the same name as before or wipe references to that name and call it something else like Controller3? Looks like more work to rename it but knowing windows I think it may be the better option?

  2. #2

    Join Date
    Jun 2010
    Location
    England
    Posts
    735
    Thank Post
    89
    Thanked 52 Times in 46 Posts
    Rep Power
    35
    You would be better off demoting Controller1 instead of just turning it off and then trying to clean up, but of course ensure you've transferred all the roles before you do this.
    Last edited by ihaveaproblem; 25th July 2012 at 05:20 PM.

  3. #3

    Ric_'s Avatar
    Join Date
    Jun 2005
    Location
    London
    Posts
    7,590
    Thank Post
    109
    Thanked 762 Times in 593 Posts
    Rep Power
    180
    Do you really mean that you have 2 servers in a cluster and they are DCs?

  4. #4

    Join Date
    Mar 2010
    Location
    shadowx@AllEvil:/
    Posts
    222
    Thank Post
    12
    Thanked 28 Times in 25 Posts
    Rep Power
    13
    Quote Originally Posted by Ric_ View Post
    Do you really mean that you have 2 servers in a cluster and they are DCs?
    Erm...yes? O.o

    Both servers are clustered; the DHCP file system is a cluster resource between them both, and they have a cluster IP of xx.xx.xx.10 (each server has xx.1 & xx.2 and xx.3 & xx.4 respectively).

    Thanks for the tip on demoting Controller1 ihaveaproblem. Looking through google it seems we are relatively lucky to have the server operational to a degree, it seems if it was a full OS failure it would be more of a pig to fix. On linux I would just uninstall the services, clear the files and re-install but I know better than that when dealing with windows!

  5. #5

    3s-gtech's Avatar
    Join Date
    Mar 2009
    Location
    Wales
    Posts
    2,697
    Thank Post
    143
    Thanked 542 Times in 486 Posts
    Rep Power
    148
    It's not ideal, but removing a failed DC is much easier now with 2008 and 2008R2 domains than with 2003 - the tools for cleaning up are more integrated and simpler to use. Demoting is still far easier though. As for its name, new is better - if only for purging any quirks you may find with stale records, odd DNS entries etc, but those shouldn't matter if you also give it the same static IP.

  6. #6
    Beard's Avatar
    Join Date
    Jun 2012
    Location
    Haywards Heath
    Posts
    15
    Thank Post
    0
    Thanked 2 Times in 2 Posts
    Rep Power
    5
    Quote Originally Posted by Ric_ View Post
    Do you really mean that you have 2 servers in a cluster and they are DCs?
    ^^ Maybe clustering these together needs to be viewed as the original issue?

  7. #7
    Beard's Avatar
    Join Date
    Jun 2012
    Location
    Haywards Heath
    Posts
    15
    Thank Post
    0
    Thanked 2 Times in 2 Posts
    Rep Power
    5
    Also, here's a link that will tell you why clustering DCs is a bad idea. Domain Controllers as Cluster Nodes - Bad Idea - Cluster Help

  8. #8

    Join Date
    Mar 2010
    Location
    shadowx@AllEvil:/
    Posts
    222
    Thank Post
    12
    Thanked 28 Times in 25 Posts
    Rep Power
    13
    Quote Originally Posted by Beard View Post
    ^^ Maybe clustering these together needs to be viewed as the original issue?
    Go on? Just to clarify, when I say the DCs are clustered I mean that the DHCP and Witness disks are cluster resources, as are the disks that contain home drives, file shares etc. They are physically on a SAN but are mounted as a cluster volume on one or the other DC. The actual OS and Active Directory data is stored physically on each set of hardware, if that makes sense? Essentially the cluster is there to support shared files/folders.

    When we were looking into it we couldn't see any downsides to clustering unless we missed something in which case now is a great time to tell me what we missed! If it's a massive oversight we could look at shuffling the AD around a bit.

    EDIT: Seen your link....

    From the list of downsides at least half of those would still apply if you had two separate DCs with no clustering, for example downsides such as "They should point at each other for DNS and if one node goes down the other can't resolve anything". Surely in an unclustered two-DC network you hit the same problem? Our servers have two DNS records set, one to the other server and one to themselves, so if the other server is down then they hit themselves for DNS.

    Perhaps I didn't explain it properly in my first post, the AD service is not clustered but the servers themselves sit in a cluster purely to share cluster volumes such as the DHCP disk and the file shares.

  9. #9

    Ric_'s Avatar
    Join Date
    Jun 2005
    Location
    London
    Posts
    7,590
    Thank Post
    109
    Thanked 762 Times in 593 Posts
    Rep Power
    180
    As the mighty @Beard says, the original problem will stem from the fact your DCs are on your cluster nodes. You gain no advantage from clustering in this scenario and it is against best practice.

    Assuming you are incredibly lucky and all goes well, do the following:

    1. Create a third DC on something and move ALL the FSMO roles to it
    2. Pray that replication works
    3. Demote your existing DCs
    4. Clean up AD when your existing DCs fail to demote properly (see Delete Failed DCs from Active Directory )
    5. Move your DHCP to the server you made in step 1
    6. Destroy the cluster
    7. Build 2 new Windows servers and get them patched up
    8. Make your two new servers DCs and make sure you have NO errors in the logs
    9. Wait for several hours and keep checking the logs on all 3 DCs
    10. Assuming you have no errors, move DHCP back to your new servers (see Balance the load on your DHCP servers by using the 80/20 rule for scopes if you want to balance load)
    11. Move ALL the FSMO roles back to your new DCs
    12. Wait a few hours, checking for errors
    12. Demote your temporary DC created in step 1
    13. Check all logs for errors
    14. Get back to less tedious work
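
    Step 10 mentions the 80/20 rule for DHCP scopes. As a rough illustration of the arithmetic, the Python sketch below splits one address range into the part each server hands out. The range and the function name are made up for the example, and it only does the sums; it does not configure a DHCP server.

```python
import ipaddress


def split_scope(first, last, primary_percent=80):
    """Split an inclusive address range into primary and secondary parts."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    if end < start:
        raise ValueError("last address must not be below first address")

    total = end - start + 1
    # Integer arithmetic avoids floating point surprises on the boundary.
    primary_count = total * primary_percent // 100
    if not 0 < primary_count < total:
        raise ValueError("split leaves one server with no addresses")

    boundary = start + primary_count
    primary = (str(ipaddress.IPv4Address(start)),
               str(ipaddress.IPv4Address(boundary - 1)))
    secondary = (str(ipaddress.IPv4Address(boundary)),
                 str(ipaddress.IPv4Address(end)))
    return primary, secondary


if __name__ == "__main__":
    # Hypothetical scope of 200 addresses.
    primary, secondary = split_scope("192.168.0.50", "192.168.0.249")
    print("Primary server hands out:  ", primary)
    print("Secondary server hands out:", secondary)
```

    With the 200 addresses assumed here, the primary gets .50 to .209 and the secondary gets .210 to .249. Pass `primary_percent=50` for an even split.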

  11. #10

    Ric_'s Avatar
    Join Date
    Jun 2005
    Location
    London
    Posts
    7,590
    Thank Post
    109
    Thanked 762 Times in 593 Posts
    Rep Power
    180
    Quote Originally Posted by shadowx View Post
    From the list of downsides at least half of those would still apply if you had two separate DCs with no clustering, for example downsides such as "They should point at each other for DNS and if one node goes down the other can't resolve anything". Surely in an unclustered two-DC network you hit the same problem? Our servers have two DNS records set, one to the other server and one to themselves, so if the other server is down then they hit themselves for DNS.
    Not true... they also point at themselves.

    Also, AD is a replicated system to provide the resilience for this very reason.

  12. #11

    Join Date
    Mar 2010
    Location
    shadowx@AllEvil:/
    Posts
    222
    Thank Post
    12
    Thanked 28 Times in 25 Posts
    Rep Power
    13
    Interesting. I'm not saying I disagree with you (well, I am) but I want to run through a few things, purely so I can learn more about clustering the AD servers and whatnot. It's not me calling you out or nitpicking!

    10. Assuming you have no errors, move DHCP back to your new servers (see Balance the load on your DHCP servers by using the 80/20 rule for scopes if you want to balance load)
    From my understanding from previous research this is simply to put 80% of client IPs on one server and the other 20% on the second? In which case the best I can hope for is losing 20% of my clients, and the worst case is 80%, whereas currently even with one server offline I can support 100% of my clients with DHCP.

    Also file servers... Currently our AD servers run: AD-DS, DHCP, DNS and file services, so with your solution we would then need to build two new file servers on dedicated hardware? I don't want to use just one of anything, purely for redundancy... But then with two file servers are we not likely to run into collisions with read/writes if they are both hitting one set of LUNs? With clustering we have only one instance of file services running so there is no chance of a collision, but in the event of a node failure that instance is migrated, hence we retain functionality.

    Before this network we had a Novell system which is of course linux based, under linux the clustering worked perfectly for all these services so it could be that we are looking at things through Linux tinted glasses and need to take a step back, or it could be that we are going against the grain and it will ultimately work as well or better. With no other experience to draw on I can't say which side of the line I'm on but from a logical point of view clustering makes sense...

  13. #12
    Beard's Avatar
    Join Date
    Jun 2012
    Location
    Haywards Heath
    Posts
    15
    Thank Post
    0
    Thanked 2 Times in 2 Posts
    Rep Power
    5
    Quote Originally Posted by shadowx View Post
    From the list of downsides at least half of those would still apply if you had two separate DCs with no clustering, for example downsides such as "They should point at each other for DNS and if one node goes down the other can't resolve anything". Surely in an unclustered two-DC network you hit the same problem? Our servers have two DNS records set, one to the other server and one to themselves, so if the other server is down then they hit themselves for DNS.
    As @Ric_ has just explained AD uses replication for resilience and DCs can also point at themselves. The fact that M$ have said that it is not best practice would make me question my design.

    Also, if as you say "From the list of downsides at least half of those would still apply if you had two separate DCs with no clustering" doesn't that mean that you'd reduce the risk by 50% if you did have two separate DCs using replication?

    Anyway, the genius that is @Ric_ has provided you with a good idea of how to fix the issues you are seeing. He's good with this stuff, I'd go with what he's suggested. If you decide to go this way, open a thread if you get issues and we can help.

  14. #13

    Join Date
    Mar 2010
    Location
    shadowx@AllEvil:/
    Posts
    222
    Thank Post
    12
    Thanked 28 Times in 25 Posts
    Rep Power
    13
    The fact that M$ have said that it is not best practice would make me question my design.
    Really?! It would make me more certain that I'm following the right path! If I had my way we would still have a pure linux network with high availability clustering, 100% malware resistance and increased security and stability but the school wanted sharepoint so here we are!

    I understand that the AD is resilient in that it replicates, but the AD only provides user/computer authentication realistically. It's only one part of network uptime; the rest comes from DNS and DHCP, and if any one of those services were to fail then end users have no connectivity. AD and DNS both replicate but DHCP does not, so the only way to achieve 100% uptime is to cluster it, or get ridiculous and have 100 DHCP servers each with 1% of the IP records on them and settle for 99% uptime (I know I'm being pedantic but it illustrates the point perfectly).
    Also, if as you say "From the list of downsides at least half of those would still apply if you had two separate DCs with no clustering" doesn't that mean that you'd reduce the risk by 50% if you did have two separate DCs using replication?
    I'm confused, what I mean is most of the downsides are irrelevant since they would still exist in a non clustered environment, by clustering I don't see how we increase risk/downtime, all I see are the bonuses from having resilient DHCP and file services?

    Let's ignore what microsoft says for the time being and focus on what we all actually think/know from experience (which I will openly admit is not a lot in my case, this is my first year working with a microsoft backend), what real world consequences are there for clustering that we need to be aware of? We are more than happy to redesign the servers but only if we are confident it's the route to follow.

  15. #14

    Join Date
    Jul 2012
    Posts
    38
    Thank Post
    11
    Thanked 2 Times in 2 Posts
    Rep Power
    5
    Quote Originally Posted by Ric_ View Post
    As the mighty @Beard says, the original problem will stem from the fact your DCs are on your cluster nodes. You gain no advantage from clustering in this scenario and it is against best practice.

    Assuming you are incredibly lucky and all goes well, do the following:

    1. Create a third DC on something and move ALL the FSMO roles to it
    2. Pray that replication works
    3. Demote your existing DCs
    4. Clean up AD when your existing DCs fail to demote properly (see Delete Failed DCs from Active Directory )
    5. Move your DHCP to the server you made in step 1
    6. Destroy the cluster
    7. Build 2 new Windows servers and get them patched up
    8. Make your two new servers DCs and make sure you have NO errors in the logs
    9. Wait for several hours and keep checking the logs on all 3 DCs
    10. Assuming you have no errors, move DHCP back to your new servers (see Balance the load on your DHCP servers by using the 80/20 rule for scopes if you want to balance load)
    11. Move ALL the FSMO roles back to your new DCs
    12. Wait a few hours, checking for errors
    12. Demote your temporary DC created in step 1
    13. Check all logs for errors
    14. Get back to less tedious work

    I think we will look at moving in this route...I do like point 14 but I am not sure there is less tedious work on our list...

    We only moved to MS last summer and at a very fast pace, as mentioned by shadowx...Novell seemed to have greater stability for our style of system but I think we need to think more MS...

    The main problem, as usual, was finance so we went with 2 clusters...one for hyper-v to virtualise pretty much every other server we required, and we installed file services on the DC...which now I understand is a bad idea

    Question...do you run your file services as hardware servers?

    Thanks for all the great posts

  16. #15

    Ric_'s Avatar
    Join Date
    Jun 2005
    Location
    London
    Posts
    7,590
    Thank Post
    109
    Thanked 762 Times in 593 Posts
    Rep Power
    180
    Quote Originally Posted by shadowx View Post
    10. Assuming you have no errors, move DHCP back to your new servers (see Balance the load on your DHCP servers by using the 80/20 rule for scopes if you want to balance load) From my understanding from previous research this is simply to put 80% of client IPs on one server and the other 20% on the second? In which case the best I can hope for is losing 20% of my clients, worst case is 80% whereas currently even with one server offline I can support 100% of my clients with DHCP
    80/20 is a standard thing... go 50/50 if you want. Either way, you don't 'lose' any clients, you just lose the ability to give out new IPs... the clients don't throw the IPs away just because the DHCP server isn't there - of course, I'm assuming you haven't used a lease time of a nanosecond. If, in your environment, DHCP is such a vital service requiring high availability you can always move it onto your virtualisation setup.
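
    To put rough numbers on the lease point, here is a small Python sketch. It uses the default DHCP timer ratios (renew at 50% of the lease, rebind at 87.5%) and an 8-day lease, which is an assumption for illustration; substitute whatever lease time your scopes actually use.

```python
from datetime import timedelta


def lease_timers(lease):
    """Return the default renewal (T1), rebinding (T2) and expiry times."""
    return {
        "renew_T1": lease * 0.5,      # client first tries to renew here
        "rebind_T2": lease * 0.875,   # client starts asking any server here
        "expiry": lease,              # address must be given up here
    }


def minimum_cover(lease):
    """Shortest time an already-leased client keeps its address after
    the DHCP server disappears, assuming it renewed on schedule at T1."""
    return lease - lease_timers(lease)["renew_T1"]


if __name__ == "__main__":
    lease = timedelta(days=8)  # assumed lease time, not taken from the thread
    for name, when in lease_timers(lease).items():
        print(name, "after", when)
    print("Existing clients are covered for at least", minimum_cover(lease))
```

    Under these assumptions a client that has been renewing normally always has at least half a lease left, so an outage only bites machines that need a brand new address during it.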

    Also file servers... Currently our AD servers run: AD-DS, DHCP, DNS and file services, so with your solution we would then need to build two new file servers on dedicated hardware? I don't want to use just one of anything, purely for redundancy... But then with two file servers are we not likely to run into collisions with read/writes if they are both hitting one set of LUNs? With clustering we have only one instance of file services running so there is no chance of a collision, but in the event of a node failure that instance is migrated, hence we retain functionality.
    Just put the fileserver role on your virtualised setup?

    Before this network we had a Novell system which is of course linux based, under linux the clustering worked perfectly for all these services so it could be that we are looking at things through Linux tinted glasses and need to take a step back, or it could be that we are going against the grain and it will ultimately work as well or better. With no other experience to draw on I can't say which side of the line I'm on but from a logical point of view clustering makes sense...
    The services you were clustering before were different services... your issues all come from putting your DCs on clustered nodes and your replication has broken.
