> For example, instead of a power function that uses a loop, you could generate specialized code like x * x * x * x * x directly. This eliminates runtime overhead and creates highly optimized code.
This is misguided. For decades now, there has been no reason to assume that hand-unrolled code is faster than a for-loop. Compilers optimize this stuff, and they do it even better than mindlessly multiplying x by itself. For example, raising x to the power 6 only needs 3 multiplications, see for example: https://godbolt.org/z/Edz4jjqvv
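To make that concrete, here is roughly the comparison I mean (a quick C++ sketch of my own, not from the paper; the godbolt link shows what the compiler actually emits):

    #include <cstdint>

    // Naive loop. With a constant exponent, an optimising compiler will
    // typically specialise this rather than keep the loop around.
    int64_t ipow(int64_t x, unsigned n) {
        int64_t r = 1;
        while (n--) r *= x;
        return r;
    }

    // x^6 with only 3 multiplications (x^2, then x^3, then x^3 * x^3),
    // i.e. the addition-chain shape rather than x*x*x*x*x*x.
    int64_t pow6(int64_t x) {
        int64_t x2 = x * x;
        int64_t x3 = x2 * x;
        return x3 * x3;
    }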
While there are definitely use cases for meta-programming, optimization is not one of them.
Optimization is absolutely one of them, once you start dealing with higher order programs or programs with sophisticated control flow. Compiler optimizations are not enough to infer the Futamura projections in all cases, for instance.
https://okmij.org/ftp/tagless-final/index.html#tagless-final
Interesting, I had never heard of Futamura projections before. Looking at the definition, it seems like the first projection (specializing an interpreter for a given source program) is already handled pretty well by today's compilers, just by unrolling and inlining. And they can go even further and evaluate parts at compile time, see for example https://github.com/IoanThomas/constexpr-chip8. I can see how the second and third projections are not handled by compiler optimizations, though.
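For the first projection, the kind of thing I mean is sketched below: a toy bytecode interpreter written as constexpr, where interpreting a fixed program is forced to happen entirely at compile time (a hypothetical example of my own, unrelated to the linked project beyond the constexpr idea):

    #include <array>
    #include <cstddef>

    // Toy bytecode: Push v, Add, Mul.
    enum class Op { Push, Add, Mul };
    struct Instr { Op op; int arg; };

    template <std::size_t N>
    constexpr int run(const std::array<Instr, N>& prog) {
        std::array<int, 16> stack{};
        std::size_t sp = 0;
        for (const Instr& i : prog) {
            switch (i.op) {
                case Op::Push: stack[sp++] = i.arg; break;
                case Op::Add:  stack[sp - 2] += stack[sp - 1]; --sp; break;
                case Op::Mul:  stack[sp - 2] *= stack[sp - 1]; --sp; break;
            }
        }
        return stack[sp - 1];
    }

    // (2 + 3) * 4, interpreted entirely at compile time: the interpreter
    // leaves no trace at runtime, which is the flavour of the first
    // Futamura projection (specialising the interpreter to one program).
    constexpr std::array<Instr, 5> prog{{
        {Op::Push, 2}, {Op::Push, 3}, {Op::Add, 0},
        {Op::Push, 4}, {Op::Mul, 0}
    }};
    static_assert(run(prog) == 20);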
The point is that compiler optimisations are a black box and not guaranteed. They can be very brittle with respect to seemingly harmless source changes (even something as simple as introducing an extra intermediate assignment). You are at the mercy of the 'fuel' of the optimisation passes. With staging you get to control exactly what gets inlined/partially evaluated. Of course, to get good results you need to know what to optimise for.
> With staging you get to control exactly what gets inlined/partially evaluated.
I want to stress that this is not true. Sure, sometimes it might work, but compilers can also uninline, as well as reorder the way things are evaluated. Compilers no longer do a 1:1 mapping of lines of code to assembly instructions; instead they are designed to take your program as input and generate the best executable that has the same observable effect as your code. So whatever optimization you perform in the source code is also going to be very brittle with respect to seemingly harmless compiler changes (changing compiler flags, updating the compiler to a new version, and so on).
While indeed nothing is guaranteed, at this point in time the compiler is vastly better at optimizing code than humans are. If you want to make a point that multi-stage programming helps optimize code, you have to do much better than an example of raising x to some power.
I think you are missing the point a bit. With staging you can build up arbitrary levels of compile time abstractions and be sure that they will not appear in the final executable. Of course, an optimising compiler will reorder/rearrange code regardless. But it won't reintroduce all the abstraction layers that have been staged away. After enough abstraction layers, without staging even a compiler that optimises aggressively won't know to evaluate them away.
Let's put it another way: do you think there is utility in macros at all? And do you think that type-safe code is better than untyped code? If you say yes to both, you must also think that staging is useful, since it basically gives you type-safe macros. Now lots more things can be macros instead of runtime functions, and you don't need to deal with the ergonomic issues that macros have in other languages. For a more real-world example, see Jeremy Yallop's work on fused lexing and parsing.
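To make the "abstractions that are guaranteed to disappear" point concrete in mainstream terms: the closest C++ analogue I can think of is consteval, where the layers must be evaluated at compile time or the program doesn't build. A rough sketch (and of course not real staging, since no code is generated):

    #include <array>

    // Small, composable compile-time layers. Because they are consteval,
    // they cannot be called at runtime at all: either they fold away into
    // constants or the program is ill-formed.
    consteval int square(int x) { return x * x; }
    consteval int clamp(int x, int lo, int hi) {
        return x < lo ? lo : x > hi ? hi : x;
    }

    template <int N>
    consteval std::array<int, N> make_table() {
        std::array<int, N> t{};
        for (int i = 0; i < N; ++i) t[i] = clamp(square(i), 0, 100);
        return t;
    }

    // Only the resulting data survives into the binary; square, clamp and
    // make_table exist purely at compile time.
    constexpr auto table = make_table<16>();
    static_assert(table[5] == 25 && table[15] == 100);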
Does this have practical advantages over the JIT runtime optimisation used in e.g. the JVM? My assumption is that you can generate optimised code without as much of the JVM's analysis overhead, or with more specific optimisations (ones controlled by the programmer), but honestly the details of how this might compare in practice are a bit over my head.
This is fascinating. I could see it being very useful for writing SIMD abstraction layers (like Highway or SIMDe) without so much of the cruft.
> For example, instead of a power function that uses a loop, you could generate specialized code like x * x * x * x * x directly. This eliminates runtime overhead and creates highly optimized code.
Could anyone explain to me how this is different from templates or parameter pack expansion in C++? I can see the constexpr-ness here is encoded in the type system and appears more composable, but I am not sure if I am missing the point.
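Roughly the C++ formulation I have in mind, for reference (a sketch of my own, not from the paper):

    #include <cstddef>
    #include <utility>

    // Exponent as a template parameter; a fold over an index pack expands
    // to x * x * ... * x, so no loop is left at runtime.
    template <typename T, std::size_t... I>
    constexpr T power_impl(T x, std::index_sequence<I...>) {
        return (T{1} * ... * ((void)I, x));
    }

    template <std::size_t N, typename T>
    constexpr T power(T x) {
        return power_impl(x, std::make_index_sequence<N>{});
    }

    static_assert(power<5>(2) == 32);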
I looked at the paper but I can't find anything related to C++.
> Could anyone explain to me how this is different from templates or parameter pack expansion in C++?
I don't think it's any different.
> I can see the constexpr-ness here is encoded in the type system
I also see they introduce new constructs like let$, so it's not just a type system thing.
> I looked at the paper but I can't find anything related to C++.
I don't think the author needs to compare their code to C++. That said, it looks to me like it is similar to the reflection capabilities coming in C++26.
Typically, multistage languages permit program generation at any stage, including runtime. So that would be different from C++.
You can get pretty far in C++ though: https://codereview.stackexchange.com/questions/259045/poor-m...
This looks like a reinvention of the final, tagless interpreter, which has been a great, common technique in functional languages for a while:
https://okmij.org/ftp/tagless-final/index.html#tagless-final
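For anyone who hasn't seen it: the canonical presentations at that link are in OCaml and Haskell, but the shape of the idea carries over even to C++ templates, roughly like this (my own rough transliteration, so take it as a sketch):

    #include <iostream>
    #include <string>

    // Tagless-final style: a term is a generic function over an abstract
    // "semantics" providing lit/add/mul. No AST, no tags, no pattern
    // matching; each interpreter is just another semantics.
    struct Eval {
        using repr = int;
        repr lit(int n) const { return n; }
        repr add(repr a, repr b) const { return a + b; }
        repr mul(repr a, repr b) const { return a * b; }
    };

    struct Print {
        using repr = std::string;
        repr lit(int n) const { return std::to_string(n); }
        repr add(repr a, repr b) const { return "(" + a + " + " + b + ")"; }
        repr mul(repr a, repr b) const { return "(" + a + " * " + b + ")"; }
    };

    // One term, many interpretations.
    template <typename Sem>
    typename Sem::repr example(const Sem& s) {
        return s.mul(s.add(s.lit(1), s.lit(2)), s.lit(4));
    }

    int main() {
        std::cout << example(Eval{}) << "\n";   // 12
        std::cout << example(Print{}) << "\n";  // ((1 + 2) * 4)
    }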
It's an interpreter though, not a JIT. This kind of programming language thing is a bit of a hobby horse of mine, so see the comment I just posted on this for full details:
https://codereview.stackexchange.com/questions/259045/poor-m...
How is this different from a syntactic macro?
Two big differences: the generated code is typechecked, so staging behaves like type-safe macros rather than syntax substitution, and generation isn't limited to compile time, so specialised code can be produced and run at runtime.