Hello all,
My client has asked me a question that I don't know how to answer:
"We have a tap between a server and client.
Why would the number between the (Server to AMD) and the (AMD to Client) not be the same?
If anything, Server to Client should be higher?"
I don't understand where the AMD fits in this equation. Can someone help clarify?
Server packets lost (server to AMD)
The number of packets sent by a server that were lost - between the server and the AMD - and needed to be retransmitted.
Server packets lost (AMD to client)
The number of packets sent by a server that were lost - between the AMD and the client - and needed to be retransmitted.
Answer by Ulf T. · Nov 09, 2015 at 11:46 PM
Hey Brett - Did you digest Matt's excellent description? If not - let me try to put it in terms the customer might understand:
"We have a tap between a server and client. (Very good)
Why would the number between the (Server to AMD) and the (AMD to Client) not be the same?
Simply because it's different traffic. Packets can be lost before the AMD as well as after it. We know the total number of lost packets, and we know the number of packets that have passed the TAP/AMD, and also which packets they were (ACKs and retransmissions). The difference between those two is what was lost between the server and the AMD. The decode keeps track of the sequence numbers and so knows when a packet has gone missing.
If anything, Server to Client should be higher?" Why? Packets can be lost in both directions.
BTW - your graph is missing "Client to AMD" to make the picture complete.
So off the top of my head, there should be 6 packet loss metrics.
The nice thing with a (cleverly) placed tap is that you get instant Fault Domain Isolation, so you know whether it's in the DC or out there on the Net that the packets are lost.
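To make the sequence-number idea concrete, here is a minimal sketch of how a tap in the middle could attribute server-side loss to one side or the other. This is not the AMD's actual algorithm; the function name `classify_server_loss` and the `(seq, length)` input format are made up for illustration. It assumes that a segment the tap has already seen and now sees again was lost after the tap, and that a segment arriving late to fill a sequence gap never reached the tap the first time. It ignores sequence wraparound, reordering, resegmentation and spurious retransmissions.

```python
def classify_server_loss(segments):
    """Attribute server-side loss to a side of the tap (illustrative heuristic).

    segments: iterable of (seq, length) tuples for server-to-client data
    segments, in the order the tap observed them.
    """
    seen = set()          # segments the tap has already observed
    highest_end = None    # highest sequence number covered so far
    counts = {"server_to_amd": 0, "amd_to_client": 0}

    for seq, length in segments:
        end = seq + length
        if (seq, length) in seen:
            # The tap saw the original, so it was lost after the tap.
            counts["amd_to_client"] += 1
        elif highest_end is not None and seq < highest_end:
            # First sighting, but it fills a gap: the original never
            # reached the tap, so it was lost before the tap.
            counts["server_to_amd"] += 1
        seen.add((seq, length))
        highest_end = end if highest_end is None else max(highest_end, end)

    return counts


# Hypothetical trace: the segment at 200 is lost before the tap,
# the segment at 400 is lost after it.
trace = [(0, 100), (100, 100), (300, 100), (200, 100), (400, 100), (400, 100)]
result = classify_server_loss(trace)
print(result)  # {'server_to_amd': 1, 'amd_to_client': 1}
```

The same bookkeeping applied to the client's segments would give the client-side counterparts of these counters.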
Thank you for your detailed response. Yes, those 6 perspectives all make sense and do paint a great picture of where the loss is happening. These metrics are much more useful than I imagined :)
Answer by Matt L. · Nov 09, 2015 at 08:08 PM
Hey Brett,
My assumption (need validation from someone else please, @Chris Vidler, @Mike Hicks) for these metrics is that the Server Loss Rate (Server to AMD) is due to packets lost between the server transmission, and the AMD sniffing point, and that the Server Loss Rate (AMD to Client) is due to packets lost between the AMD sniffing point and the Client receiving. Then the Total Server Loss Rate is simply the sum of these two metrics:
This is possible because the AMD is in the middle, so it can see which packets reached its sniffing point in the first transmission versus what the client ends up requesting as retransmissions. It can then take the differences to work out the two values.
Simplified example: the AMD sees 7 packets go past it from the server. It then sees an ACK from the client saying it received 5 of the original 10 packets the server says it sent. In this example, Server Loss Rate (Server to AMD) is 3 packets (10 - 7), Server Loss Rate (AMD to Client) is 2 packets (7 - 5), and Total Server Loss Rate is 5 packets (3 + 2, or the retransmission request from the client).
This is useful, as it now tells us where the majority of the loss rate is taking place. Assuming we know the position of the AMD sniffing point on the network, using your datapoint, we can safely assume that the majority (14.3k Pkts) of the total Server-side loss rate (14.3k + 9.8k = 24.1k Pkts) is due to something on the network between the Server and the AMD sniffing point.
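The arithmetic in the simplified example and in the datapoint above can be written out as a short sketch. The function names are hypothetical and this only mirrors the subtraction and addition described here, not how the product computes the metrics internally.

```python
def split_server_loss(sent_by_server, seen_at_amd, acked_by_client):
    """Split server-side packet loss into the two legs around the AMD."""
    if not (sent_by_server >= seen_at_amd >= acked_by_client >= 0):
        raise ValueError("expected sent >= seen at AMD >= acked >= 0")
    server_to_amd = sent_by_server - seen_at_amd
    amd_to_client = seen_at_amd - acked_by_client
    return {
        "server_to_amd": server_to_amd,
        "amd_to_client": amd_to_client,
        "total": server_to_amd + amd_to_client,
    }


def server_to_amd_share(server_to_amd, amd_to_client):
    """Fraction of total server-side loss that happened before the AMD."""
    total = server_to_amd + amd_to_client
    return server_to_amd / total if total else 0.0


# Simplified example: 10 sent, 7 seen at the AMD, 5 acknowledged.
example = split_server_loss(10, 7, 5)
print(example)  # {'server_to_amd': 3, 'amd_to_client': 2, 'total': 5}

# Datapoint from the question: 14.3k and 9.8k lost packets.
share = server_to_amd_share(14_300, 9_800)
print(f"{share:.1%}")  # 59.3%
```

So in this datapoint roughly 59% of the server-side loss sits between the server and the AMD sniffing point.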
It's just another way for us to break down the communication between server and client a bit more, thanks to the positioning of our AMD.
Thanks,
Matt Lewis
Just verified this by adding the three Server Loss Rate metrics in a DMI report:
Thank you for the great explanation, that makes perfect sense! I wasn't thinking about the AMD seeing the retransmissions and inferring where the loss happened.
Answer by Jeroen H. · Dec 21, 2015 at 09:36 PM
Hi,
I have some sample capture files available from a customer who has multiple capture points and AMDs in their environment. Some application flows happen to pass more than one of these capture points, so I had the chance to capture traffic simultaneously on both sides of the network loss. The actual loss is happening on a switch in between the two capture points.
There is one trace file where a server packet is lost, and in the other file (from the capture point closest to the server) this packet is still there.
These two samples can help you interpret how a tool like Wireshark reacts to both situations, as it uses different colorings for the retransmitted packets.