We are moving toward having dynaTrace in a central position in our production operations, from monitoring to alerting and beyond. To make this work, we need to move to a model with no single point of failure, which is difficult to achieve with dynaTrace 4.1.
One area I'm focused on is the dynaTrace server itself and the persistent data stores (Oracle and file-based data), looking at how to handle catastrophic failure. The database is already being replicated, within about 10-30 seconds of the data being updated in the primary database. My questions concern the dynaTrace file system itself.
We have identified the following directories as needing near-continuous replication; the other trees can be replicated less frequently:
??? - missing anything?
The question: what is dynaTrace engineering's recommendation for file-based replication? We anticipate having around 2-4 TB of data when done. We currently run on SAN, so we are preferentially looking at SRDF (block-level replication). However, I don't know whether the dynaTrace storage model is compatible with block-level replication, since a record could span multiple blocks. The other approach is something like rsync. Do you have any guidelines or recommendations for keeping two datacenters in sync, so that if one datacenter fails we can spin up the secondary dynaTrace server and pick up very close to where we left off (i.e. within a minute or two)?
Answer by Günter S.:
The following directories are crucial for the collector/server to work.
If you are running the server normally:
dynatrace-4.1.0/dtserver.ini (= startup configuration of the server)
dynatrace-4.1.0/server/cache (= runtime data, you only need to copy the .imap file, the rest is optional)
dynatrace-4.1.0/server/conf (= the whole configuration, including license, system profiles, dashboards, permission..)
If you are running the server instanced, you need the same directories for every instance.
If you are running the collector(s) normally:
dynatrace-4.1.0/collector.ini (= startup configuration of the collector)
dynatrace-4.1.0/collector/cache (= runtime data, same as server, only .imap are important, rest is optional)
dynatrace-4.1.0/collector/conf (= configuration data)
If you are running instanced collector(s), you need the same directories for every collector instance.
You also need to back up your "storage" directory, the one you are writing the PurePaths to. As for the question of block-level replication, I don't have any experience with that, so I can't really help you with an answer here.
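For the rsync route mentioned in the question, the directory list above could be driven by a small script along these lines. This is only a sketch: the install path, the standby hostname, and the storage location are assumptions, not dynaTrace recommendations, and the script only echoes the commands it would run so you can review them first.

```shell
#!/bin/sh
# Hypothetical locations -- adjust to your environment.
DT_HOME="/opt/dynatrace-4.1.0"            # assumed install directory
STANDBY="dtstandby:/opt/dynatrace-4.1.0"  # assumed standby host and path

# High-frequency set from the answer above, plus the PurePath storage
# directory ("storage" here is a placeholder for your configured path).
HOT_DIRS="server/conf server/cache collector/conf collector/cache storage"

for d in $HOT_DIRS; do
  # -a preserves permissions/timestamps, -z compresses over the wire,
  # --delete keeps the standby an exact mirror of the primary.
  CMD="rsync -az --delete $DT_HOME/$d/ $STANDBY/$d/"
  echo "$CMD"   # replace echo with actual execution once verified
done
```

Run from cron (or a loop with a short sleep) to approximate near-continuous replication; the dtserver.ini and collector.ini files can go in the lower-frequency set since they rarely change. Note that rsync gives crash-consistent copies at best, so test a standby start-up from replicated data before relying on it.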
We do have another customer who runs a hot-standby server, but they do not replicate data between the two servers. In case of a failure, the second environment takes over and its data is written to a different storage location and a different database until the problem is solved. So they will not have all captured information in one dtServer, which seems to be your goal.