9 Replies - 956 Views - Last Post: 19 January 2019 - 08:56 PM

Poll: Production metrics and their actionable uses (2 member(s) have cast votes)

Have you ever utilised production metrics data while developing?

  1. Yes, through an APM tool (0 votes [0.00%])

    Percentage of vote: 0.00%

  2. Yes, through an IDE plugin (0 votes [0.00%])

    Percentage of vote: 0.00%

  3. No, I haven't found the right tools (0 votes [0.00%])

    Percentage of vote: 0.00%

  4. No, I have never found this useful (1 vote [50.00%])

    Percentage of vote: 50.00%

  5. Yes, through other means (please comment) (1 vote [50.00%])

    Percentage of vote: 50.00%


#1 Norbo11   User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 3
  • Joined: 17-January 19

Production metrics and their actionable uses

Posted 17 January 2019 - 09:58 AM

If you've ever used or heard of Application Performance Monitoring tools such as New Relic or the ELK stack, you'll know that they provide you with the following features:

- Various production metrics on a deployed application, and hooks into most languages to collect said metrics
- Per-function and sometimes per-line profiling data (also hooks into web frameworks like Flask to give you stats for each of your API endpoints)
- Tracking of exceptions and error rates

However, these APM tools are largely targeted at operations/sysadmin people. There is arguably a huge amount of very useful data in there, which could aid you in your day-to-day development work. Imagine:

1. Writing some code
2. Deploying that code
3. Knowing immediately, on a line-by-line basis, where your bottlenecks are
4. Seeing some exceptions unfold in your IDE, in real-time
5. Playing back through particular events which occurred on live production servers in order to debug production issues, locally, while having access to the exact inputs which were used
6. Having your IDE infer the complexity of your functions based on live production data
7. Having all of your production metrics be version-control-aware straight out-of-the-box, such that you could do all of the above with respect to any deployed revision
8. ???

In his paper "Developer Targeted Analytics: Supporting Software Development Decisions with Runtime Information", Jürgen Cito writes:

- Runtime information delivered in the form of centralized, external dashboards is not actionable for software developers
- When solving problems that have occurred at runtime, developers rely on beliefs and intuition rather than utilizing metrics
- Data combinatorics in production environments are very different from what profilers and tests can simulate, either locally or in staging environments

All of which are very sensible observations - or are they? This is where I need your help.

I am using his work as a basis for developing my own solution to this problem, and I thought I would reach out to some real developers, in order to help me:

1. Evaluate the importance/need/impact of a solution to this problem
2. Draw some inspiration from software developers much more experienced than myself
3. Gather real opinions on how to improve the development process as a whole, and lead to quicker turnaround times and higher quality code

I have a good set of initial ideas as to where I'd like to take this. A lot of the work will involve writing a very intuitive plugin for IntelliJ in order to make this kind of information actionable, and hopefully aid in the daily decision making process of developers.

So, with all of the above in mind, I'd like to ask the community here:

1. What are your general thoughts on all of the above?
2. What kinds of features do you wish your IDE had in order to make use of production metrics data?
3. Do you know/use any other tools which tried to achieve the same thing? How did you find that experience?


Many thanks.

For some reason, the poll got messed up :( The third option should say "No, I haven't found the right tools"

Replies To: Production metrics and their actionable uses

#2 Skydiver   User is online

  • Code herder
  • member icon

Reputation: 6773
  • View blog
  • Posts: 23,078
  • Joined: 05-May 12

Re: Production metrics and their actionable uses

Posted 17 January 2019 - 11:06 AM

Yes, I have used production metrics data while developing. The data was gathered by strategically placing logging/telemetry calls to a proprietary library that also knew the current version of the product. We took care not to log any sensitive data (e.g. passwords, IP addresses, machine names, MAC addresses, etc.) or any personally identifying data. If the user had opted in to send data back home, the data was put into a data warehouse and analyzed frequently to find trends and patterns. We could query the data to see how often something crashed and where. We could see what set of operations led to the crash or problem. If there was a crash, a callstack of the crash was packaged up, along with a very small memory dump. We could see which features were used a lot, and which features didn't get used that often. Although the data was anonymized, there was enough information to derive broader demographic information, like which countries/regions the data came from and which industries the product was being used in.
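As a minimal sketch of that pattern in Python (the `record_event` helper, version constant, and field names are all made up for illustration, not the actual proprietary library):

```python
import json
import time

PRODUCT_VERSION = "2.1.0"  # in the real library this was stamped in at build time

# field names we refuse to ship, mirroring the "no sensitive data" policy
BLOCKED_FIELDS = {"password", "ip_address", "machine_name", "mac_address"}

def record_event(name, **fields):
    """Build a version-tagged telemetry event, dropping sensitive fields."""
    clean = {k: v for k, v in fields.items() if k not in BLOCKED_FIELDS}
    event = {"ts": time.time(), "version": PRODUCT_VERSION, "event": name, **clean}
    # in production this would be queued for the opt-in uploader
    return json.dumps(event)

payload = record_event("crash", module="checkout", line=42, password="hunter2")
assert "hunter2" not in payload  # sensitive field scrubbed before leaving the process
```

The key point is that scrubbing happens at the callsite library, before anything leaves the process, and every event carries the product version so the warehouse can slice by release.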

If there was a serious crash or problem, AND the user put in a support call, we could ask them to enable additional tracing options that gather data leading up to the problem, and have them upload the tracing data.

Having the more general anonymous kind of data readily available in the IDE would not really have been that useful. It was easy enough to simply have another window open with the results of the query I described above. More often than not, just having the stack trace was good enough. Sometimes the memory dump helped a little. For the less common cases, seeing the set of operations before the problem helped.

For the cases where they uploaded the extra tracing data, it's more typical to load that into the debugger and do some historical tracing that way, so I guess in a sense it is built into the IDE, not merely a plugin.

When doing product planning, we use the same data. We aren't at our IDEs while product planning anyway, so I don't see how a plugin would help.

Personally, I don't want anything added into my IDE. CodeLens is the first thing I turn off whenever I have a fresh installation of Visual Studio. Some people may like being able to see all that CodeLens has to offer, but for me it just clutters up the view and keeps me from focusing on the code. Adding production telemetry into the code view is going to be more distracting than useful.

#3 astonecipher   User is offline

  • Senior Systems Engineer
  • member icon

Reputation: 2769
  • View blog
  • Posts: 10,963
  • Joined: 03-December 12

Re: Production metrics and their actionable uses

Posted 17 January 2019 - 11:49 AM

Ahh, product market research... As for why a plug-in would be helpful, I wonder what you are actually working on / attempting to create. Also, being able to access production data in a dev environment is a huge no-no in every place I have worked; it runs into everything from HIPAA concerns to security issues.

#4 Skydiver   User is online

  • Code herder
  • member icon

Reputation: 6773
  • View blog
  • Posts: 23,078
  • Joined: 05-May 12

Re: Production metrics and their actionable uses

Posted 17 January 2019 - 03:02 PM

Personally, I think the harder problem to solve, and the one that is ripe for a lot of profit, is coming up with an expert system that guides developers, program managers, product designers, and product owners in forming the questions they have about their product, and then mapping those questions down to the low-level telemetry (and corresponding database schema) that the application needs to capture in order to answer them.

For a fully web-based application that can be easily updated as new questions come up, it's going to be a lot easier to take an iterative approach; but for desktop applications that require bigger downloads to update, more care and planning is needed.

Or at least for me it was difficult, because the proprietary library I was using followed a company policy that telemetry packets sent back couldn't exceed 4 KB in size, and packets could only be sent out once a day. Deciding what should be counters, what should be flags, what should be measures, and what should be actual data samples was tough in a greenfield project. For an existing project that already had a telemetry packet design, the job was deciding which telemetry we no longer needed in order to make room for the new telemetry we did.

#5 Norbo11   User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 3
  • Joined: 17-January 19

Re: Production metrics and their actionable uses

Posted 17 January 2019 - 04:00 PM

Thank you for all of your replies.

Quote

Ahh product market research... Why a plug-in would be helpful I wonder what you are actually working on/ attempting to create. And being able to access production data in a dev environment is a huge no-no in every place I have worked, it interferes with everything from HIPPA concerns to security issues.


Perhaps I should share more about what it is I have in mind, as I may have been slightly misunderstood. This isn't product research for a company; I'm actually a student writing a final master's project on this subject and I'm not planning to sell anything for profit. Instead, I want to come up with something relatively novel which would have a positive impact on the software development process.

Skydiver: The metrics which I had in mind were not any kind of user-identifying data. They are simply metrics relating to the actual code itself: profiling data and exceptions are the two most common. Privacy would not be a concern.

I am also focusing mainly on the SaaS model, where users frequently interact with a web application, which in turn interacts with a series of micro-services through well-defined APIs. As you know, such software is now often produced under the continuous delivery paradigm, where applications can change multiple times per week and be deployed immediately (move fast, break things mentality). I have a feeling that you wrote your answers with more classical desktop software in mind.

With the above in mind, the problem is this: no matter how much you test and profile your code locally, it will never give you a true picture of what happens once the code is deployed. Developers are therefore missing out on the most important source of information: production metrics data. All of your code, even code without any test coverage, is constantly being "tested" by your actual user base as it interacts with your application and the services behind it. With each user request to any of your API endpoints, the entire trace for executing that request can be profiled (assuming low overhead), and that profile can be saved. Every time an exception causes a request to fail, that exception can be logged along with its origin. We can save not only per-line execution times, but also things like the sizes of the collections being passed as arguments to the functions within your service, and even the exact payloads being sent to the service.
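As a rough Python sketch of the collection side of this (the decorator and all names are invented for illustration; a real agent would hook in at the framework level rather than via explicit decoration):

```python
import time
from collections import defaultdict
from functools import wraps

# per-function samples: (elapsed_seconds, {arg_name: collection_size})
samples = defaultdict(list)

def traced(fn):
    """Record elapsed time and the sizes of any sized arguments on every call."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        sizes = {f"arg{i}": len(a) for i, a in enumerate(args) if hasattr(a, "__len__")}
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            samples[fn.__qualname__].append((time.perf_counter() - start, sizes))
    return wrapper

@traced
def dedupe(items):
    return sorted(set(items))

dedupe([1, 2, 2, 3])
elapsed, sizes = samples["dedupe"][0]
# sizes records that the call received a 4-element collection
```

Note that this records collection sizes rather than collection contents, which is exactly the privacy-preserving subset of "exact arguments" discussed further down.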

Now, what I am proposing is making all of that available to the developer immediately. As soon as you deploy something, your users are interacting with your codebase in a real production setting. The proposed IDE plugin would take that data and intelligently map it onto language constructs such as individual lines of code. Then, by enabling a feature within the plugin, you could suddenly see a colour-coded editor, with shades of red and green representing the average running time of each line. The exceptions which were thrown (in the past day, week, or whatever window you choose) could be shown through other visual cues. The developer did not have to test that code, or even profile it. Whenever that developer, or anyone else working on that part of the code, wants to improve it, they will have intelligent contextual information available to them right as they code.

Combine this with the fact that we can store the exact arguments passed into functions, as well as the overall payloads sent to the service, and suddenly you can look at common usage patterns and answer questions like "how do my users interact with this specific function?" or "what is the estimated time complexity of this function, based on the collected running-time/collection-size pairs?". This is much more powerful than knowing that "overall, this particular query took 3s on average" (as an APM tool would tell you), because the information is far more granular and directly actionable for the developer.
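For example, the complexity estimate could come from fitting a line to log(time) against log(size); a Python sketch with synthetic (made-up) timings:

```python
import math

def estimate_exponent(pairs):
    """Least-squares slope of log(time) vs log(size): ~1 => linear, ~2 => quadratic."""
    xs = [math.log(n) for n, _ in pairs]
    ys = [math.log(t) for _, t in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# synthetic samples drawn from a quadratic function: t = 1e-6 * n**2
pairs = [(n, 1e-6 * n * n) for n in (10, 100, 1000, 10000)]
print(round(estimate_exponent(pairs), 2))  # → 2.0
```

Real production timings would be noisy, so a plugin would want many samples and some outlier filtering before presenting an estimate, but the basic fit is this simple.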

Does the above make sense? Do not be afraid to completely rip this apart if you see it as infeasible or useless - I would love to hear your thoughts.

#6 Skydiver   User is online

  • Code herder
  • member icon

Reputation: 6773
  • View blog
  • Posts: 23,078
  • Joined: 05-May 12

Re: Production metrics and their actionable uses

Posted 17 January 2019 - 04:41 PM

View PostNorbo11, on 17 January 2019 - 06:00 PM, said:

Privacy would not be a concern.

:

Combine this with the fact that we can store exact arguments being passed into functions, as well as overall payloads being sent to the service

Come again? So if my code is the checkout code, I get to see the end-users' credit cards, PII, and what things they are ordering right now?

If my code is the product details list page, I get to see which user is browsing what item right now?

If my code is doing security static code analysis, I get to see what code they have uploaded for scanning right now?

#7 Skydiver   User is online

  • Code herder
  • member icon

Reputation: 6773
  • View blog
  • Posts: 23,078
  • Joined: 05-May 12

Re: Production metrics and their actionable uses

Posted 17 January 2019 - 06:16 PM

View PostNorbo11, on 17 January 2019 - 06:00 PM, said:

Skydiver: I have a feeling that you wrote your answers with more classical desktop software in mind.

Yes because your poll question was phrased as:

Quote

Have you ever utilised production metrics data while developing?
:
Yes, through other means (please comment)


There was no stipulation in the poll that it was only in a SaaS context.

#8 Skydiver   User is online

  • Code herder
  • member icon

Reputation: 6773
  • View blog
  • Posts: 23,078
  • Joined: 05-May 12

Re: Production metrics and their actionable uses

Posted 17 January 2019 - 06:32 PM

View PostNorbo11, on 17 January 2019 - 06:00 PM, said:

Combine this with the fact that we can store exact arguments being passed into functions, as well as overall payloads being sent to the service, and suddenly you can look at common usage patterns and answer questions like "how do my users interact with this specific function?", or "what is the estimated time complexity of this function based on the collected running time/collection size pairs?" . This is much more powerful than knowing that "overall, this particular query took 3s on average" (like an APM tool would tell you), as the information is much more granular, and directly actionable to the developer.

So what? It's not that often that a user story, feature work, or bug report comes along where the task is to improve the performance of a chunk of code. More often than not, the current work I have to do is to add a new feature or to fix a bug, not to improve performance because something is timing out.

Are you hoping that developers will be constantly refactoring, and will see that something is failing a lot, or is running slow, and so will refactor that bit of code along with implementing the user story or fixing the bug they were originally tasked with? Are you hoping that developers will recognize that technical debt is building up in some section of code and take action to reduce it now, instead of later? So will they tell their team: "You know those 3 story points I assigned to that story? Can we bump that up to 8 story points, because it'll take me a while to refactor some 'bad' code paths detected by our production metrics?" (As a quick aside, you aren't supposed to change story point estimates mid-sprint -- you are supposed to suck it up and let the story spill over to the next sprint, or commit the other agile sin: work overtime to fit 8 story points' worth of work into 3. Bad estimates are supposed to teach the agile team to estimate better.)

Or are you hoping that when developers are doing a code review, these color-coded bits of information will induce them to add new items to the backlog as technical debt to be paid later? If this is the case, then production metrics still aren't being used for day-to-day development. They're just being used to feed the backlog.

#9 Norbo11   User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 3
  • Joined: 17-January 19

Re: Production metrics and their actionable uses

Posted 19 January 2019 - 11:14 AM

View PostSkydiver, on 17 January 2019 - 04:41 PM, said:

View PostNorbo11, on 17 January 2019 - 06:00 PM, said:

Privacy would not be a concern.

:

Combine this with the fact that we can store exact arguments being passed into functions, as well as overall payloads being sent to the service

Come again? So if my code is the checkout code, I get to see the end-users' credit cards, PII, and what things they are ordering right now? If my code is the product details list page, I get to see which user is browsing what item right now? If my code is doing security static code analysis, I get to see what code they have uploaded for scanning right now?


I should not have used the phrase "exact arguments" here. That would certainly cause a privacy concern for certain arguments, but for other, non-sensitive data, it might not. The problem would be solving the technical challenge of distinguishing between the two. But initially, my idea was to store non-sensitive information, such as the size of any collections being passed in. This would allow you to infer the time complexity of functions, for example.

View PostSkydiver, on 17 January 2019 - 06:16 PM, said:

View PostNorbo11, on 17 January 2019 - 06:00 PM, said:

Skydiver: I have a feeling that you wrote your answers with more classical desktop software in mind.
Yes because your poll question was phrased as:

Quote

Have you ever utilised production metrics data while developing?

:

Yes, through other means (please comment)

There was no stipulation in the poll that it was only in a SaaS context.


Fair enough, I didn't specify the SaaS context in the OP, hence I specified it in my later reply. I wasn't attacking your interpretation of the post.

View PostSkydiver, on 17 January 2019 - 06:32 PM, said:

View PostNorbo11, on 17 January 2019 - 06:00 PM, said:

Combine this with the fact that we can store exact arguments being passed into functions, as well as overall payloads being sent to the service, and suddenly you can look at common usage patterns and answer questions like "how do my users interact with this specific function?", or "what is the estimated time complexity of this function based on the collected running time/collection size pairs?" . This is much more powerful than knowing that "overall, this particular query took 3s on average" (like an APM tool would tell you), as the information is much more granular, and directly actionable to the developer.
So what? It's not that often that a user story, feature work, or bug report comes along where the task is to improve the performance of a chunk of code. More often than not, the current work I have to do is to add a new feature or to fix a bug, not to improve performance because something is timing out.

Are you hoping that developers will be constantly refactoring, and will see that something is failing a lot, or is running slow, and so will refactor that bit of code along with implementing the user story or fixing the bug they were originally tasked with? Are you hoping that developers will recognize that technical debt is building up in some section of code and take action to reduce it now, instead of later? So will they tell their team: "You know those 3 story points I assigned to that story? Can we bump that up to 8 story points, because it'll take me a while to refactor some 'bad' code paths detected by our production metrics?" (As a quick aside, you aren't supposed to change story point estimates mid-sprint -- you are supposed to suck it up and let the story spill over to the next sprint, or commit the other agile sin: work overtime to fit 8 story points' worth of work into 3. Bad estimates are supposed to teach the agile team to estimate better.)

Or are you hoping that when developers are doing a code review, these color-coded bits of information will induce them to add new items to the backlog as technical debt to be paid later? If this is the case, then production metrics still aren't being used for day-to-day development. They're just being used to feed the backlog.


The thing is, you're assuming that I'm proposing some kind of irreplaceable tool that would completely revolutionise development. I am proposing an extra tool to add to a developer's toolset and I'm trying to assess the potential usefulness of that tool, in specific situations. You are not forced to have something like this on all the time. You'd switch it on when you're actively seeking the information that the tool provides.

Yes, you're not necessarily improving the performance of code ALL the time. But you sure as hell are going to do it at some point, when scaling up, or when noticing that at the 90th percentile the user experience you're providing is sub-par. That's when you'll pull out a plugin like this and reap the benefits. You realise there's a bottleneck, you enable the live production-metric performance feed, and your optimisation efforts can be guided by real data.

I like that you mention technical debt, as that has given me some potentially interesting ideas. At some point your team is going to want to pay off some accrued technical debt. How can this process be made easier and more data-driven? A common thing developers might ask themselves is "how often is this code executed?". What if the plugin could aggregate all of the traces that occur in production in order to find the most visited parts of the code, and identify such hotspots immediately? What if you were told the execution distributions of the different branches in your code? Imagine you create a special branch of a function which deals with a very specific edge case. Your intuition was that this happens 1% of the time, but your production metrics could tell you something different. Wouldn't that be useful? Wouldn't it make you pay more attention to that edge case?
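A toy Python sketch of such branch counters (all names invented for illustration; a real agent would instrument branches automatically rather than via explicit calls):

```python
from collections import Counter

branch_hits = Counter()

def hit(label):
    """Mark a branch as taken; a real agent would sample and ship these counts."""
    branch_hits[label] += 1

def handle_order(order):
    if order.get("gift_wrap"):  # intuition says this is a rare edge case...
        hit("handle_order:gift_wrap")
        return "wrapped"
    hit("handle_order:plain")
    return "plain"

# simulated production traffic: 3 gift-wrapped orders out of 10
for order in [{"gift_wrap": True}] * 3 + [{}] * 7:
    handle_order(order)

# branch_hits now shows the real 30/70 split, not the assumed 1%
```

Overlaying those counts on the `if`/`else` arms in the editor is exactly the kind of intuition-correcting cue I have in mind.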

Spitballing here: what if you could track the executing machine's memory usage alongside the lines of code being executed in production at that time? Maybe that would let you spot spikes in memory usage, pinpoint the exact location in the code creating them, and diagnose such issues much more effectively.

#10 Skydiver   User is online

  • Code herder
  • member icon

Reputation: 6773
  • View blog
  • Posts: 23,078
  • Joined: 05-May 12

Re: Production metrics and their actionable uses

Posted 19 January 2019 - 08:56 PM

I'm not assuming that the tool you are proposing will be irreplaceable or revolutionary. I'm just saying that I'm not seeing the day-to-day value of it.

Let's say some of your developers are night owls, but most of your customers are daytime users. What value is the live production metrics feed to your night-owl developers, since they'll barely see any traffic going by?

If you say, "Okay, it's not live... it's a recording of all the live data, and developers can play back any time frame they want," then that implies you have some central repository collecting all that data for later playback. How is that any different from the centralized data metrics that Cito claims are not effective?

Also, playing back data to identify problems in code is not unique. A lot of video game houses record their gameplay sessions -- all the user inputs from each player, frames per second, memory, CPU, GPU temps, network packets actually sent/received, current game states, etc. It makes it easier to work on bug fixes by identifying a time range, and they keep playing back that time range against code in development to see why something broke or is causing bottlenecks; once things are fixed, they keep that data around as a future regression test. So although it's not quite the same as live production data, it's pretty darn close when the developers and testers are avid players of the game they are developing.

I'm also not seeing the value of the color coding. Granted, I'm an old-school developer who used to write code on paper and on monochrome screens, so all the newfangled syntax highlighting really doesn't do much for me. Some people claim it improves their productivity. Not so for me. The only thing I recall that did bump up my productivity was Source Insight's version of syntax highlighting. Not only could I choose the color of keywords and comments, I could also choose the font, font weight, and most importantly font size -- so I could make function names in declarations 16pt, comments 4pt, and leave most everything else at 10pt.

I do get what you are trying to do. You are trying to help the developer see a heat map of the code, directly overlaid on the code, instead of the traditional profiler output that usually shows numbers (boring!) or histograms (better!) for functions, lines, components, etc. I commend you for trying to aid developers when they are in that performance-improvement mode.

It all comes down to the intelligent integration of telemetry into the code. What you can't measure, you can't improve. And as you pointed out, it's better to be measuring the actual usage of the code, as opposed to just what the developers/testers think the likely usage would be. I feel that presenting the results of all that collected data is better done in a standalone tool, as opposed to trying to overlay it on, or wedge it into, the IDE.
