2 Replies - 1523 Views - Last Post: 30 August 2017 - 03:45 PM

#1 blasterinblack   User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 10
  • Joined: 13-August 17

Writing a "formally verified" compiler - looking for hints

Posted 14 August 2017 - 09:31 AM


I hope this question does not sound like "hey guys, I am feeling lazy, just tell me how to do it". I'm indeed spending hours of google-searching for papers, but I'm beginning to run in circles. Therfore I seek external hints.

In short: I've written a parser for a Java-like language which is expected to produce JVM bytecode.

As a second step, now I want to write the mapping into a semantic model, validate that model (type checking, scopes checking and so on).

A third step will be transforming the semantic model into JVM bytecode.

An additional requirement is that I want to implement the second and third step with formal methods.

So, to be more concise about what I want to achieve:

-Let S be my language (the one I am implementing).
-Let O be a model for OOP formal semantics (general enough, language-agnostic); let S(O) my language's formal semantics defined within O; and let JVM(O) JVM bytecode's formal semantics defined within O. Note that both S(O) and JVM(O) are defined within O.
-- [JVM bytecode -> JVM(O)] is injective, so it is reversable and I can transform models written in O into JVM bytecodes.
-- [S->S(O)] is a subfunction of [S->JVM(O)], so everything written in S source will have a meaning within JVM(O).

-Let S1 be a source program written in my language S.

Now, I am pursuing one or more frameworks that will assist me as much as possible in automatizing these three processes:

1) The formal sematics specification of my language S,
2) Writing and testing the compiler itself (including its formal verification),
3) Executing users' compilations (for a given source, generate and validate a semantics model and then a target bytecode model).

I have found frameworks (either theoretical and/or executable frameworks) that assist in each one of the steps, but I failed to find an out-of-the-box ready framework that I might use for 1, 2 and 3.

Do you happen to know one, or perhaps you can suggest a least resistance path (in terms of mandays) to implement one?

Every contribution will be appreciated!


PS: I am omitting the frameworks I have been playing with in order to first focus on the theoretical problem and make sure I am understood. Later on I can give implementation details.

Is This A Good Question/Topic? 0
  • +

Replies To: Writing a "formally verified" compiler - looking for hints

#2 ishkabible   User is offline

  • spelling expret
  • member icon

Reputation: 1747
  • View blog
  • Posts: 5,898
  • Joined: 03-August 09

Re: Writing a "formally verified" compiler - looking for hints

Posted 22 August 2017 - 10:13 PM

wThis is a big topic. There are lots and lots of things to read on the subject.

I'll start you with Amel Amhed's lectures on compiler verification from OPLSS: https://www.cs.uoreg.../curriculum.php
These sorts of methods of writing verified compiles won't fit your 3 steps but these sorts of composed steps using logical relations are the reality of verified compilers. You just keep running verified steps until you wind up with something simple enough to translate to machine code. These steps are each verified using logical relations.

However, if you just want to write a specification and then just increase your confidence (not fully verify) that the implementation holds to the specification then you have other options.

For instance, in Haskell, you could do the following to fulfill your steps by doing the following

1) Write the semantics for your source language
2) Write the semantics for your target language
3) use quickcheck and smallcheck to become highly confident that your compiler preserves the semantics between the two.

Specifying the computable semantics needed for this method won't have a library. You'll just have to do it. It shouldn't be too hard. For instance, you can use values in the language you're working in to specify denotational semantics. You can use functions acting on terms to specify small step and big step semantics.

I would likely trust my life to an implementation that took this approach seriously. If not done correctly, however, this can become intractible. For instance if random programs are allowed to be generated and then evaluated you'll get infinite loops. If done right however then due to the small scope hypothesis you can catch most issues using smallcheck. Errors that arise in more complex cases get good coverage from quickcheck.

It sounds like you want to have a denotational semantics such that the semantics of S is a subset of the semantics of JVM. Finding a semantics that will for both JVM bytecode (even a small subset) and a subset of Java will be hard though. Also the requierment that the semantics of JVM be reversible seems uneeded to me.
Was This Post Helpful? 0
  • +
  • -

#3 blasterinblack   User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 10
  • Joined: 13-August 17

Re: Writing a "formally verified" compiler - looking for hints

Posted 30 August 2017 - 03:45 PM

Hi ishkabible!

thanks a lot for you answer and sorry for answering your reply so late. I've been a bit off lately.

It's true, finding semantic meta-models fitting both JVM bytecode and my language are hard to find.

Indeed, the steps I have formulated are very arguable. That's the reason I've just decided to be less ambitious and settle by more practical, yet humble goals.

I've found a framework for semantic model definition called "the K framework". They have already formulated a complete formal, executable semantics for Java 1.4. I think I could formulate within the K framework a similar semantics model for my language and then just implement a verified transpiler from S to Java. On top of the generated Java, I can just rely on a market-proven compiler.

The "verified transpiler" can be done in a lot of easy ways once I have the source (S language) semantics and the target (Java 1.4) semantics.

The actual JVM bytecode generation from Java 1.4 would not be formally verified, but it would be market-proven by state-of-the-art compilers. In addition, Java 1.4 is close enough to a simple subset of what we call "Java" so that there is plenty of attempts of verification - who knows, maybe I can later on find something.

The haskell option is probably a better option to do that in the long term, but I don't know haskell and I don't know of any ready-to-use, complete JVM or Java semantics in Haskell.

I will have a look at those lectures, thanks!

This post has been edited by andrewsw: 30 August 2017 - 11:19 PM
Reason for edit:: Removed previous quote, just press REPLY

Was This Post Helpful? 0
  • +
  • -

Page 1 of 1