static analysis: It would be good to statically prevent certain code patterns (such as a stack-allocated nsCOMPtr), or to make code behave differently than C++ normally would (for example, changing the behavior of a stack-allocated nsCOMArray versus a heap-allocated one);
We are hoping to use trace-based optimization to speed up JavaScript. If the C++ frontend shares a common runtime with the JS frontend, these optimizations could occur across JS/C++ language boundaries. The Moz2 team brainstormed using a C++ frontend that would produce Tamarin bytecode, but Tamarin bytecode doesn’t really have the primitives for operating on binary objects.
I don’t know. Some wild speculation is below. Please don’t take it as anything more than a brainstorm informed by a little bit of IRC conversation.
LLVM is a project implementing a low-level bytecode format with strong type annotations, along with a code generator, optimizer, and compiler (static and JIT) that operate on this bytecode. There is a tool which uses the GCC frontend to compile C/C++ code into LLVM bytecode. It is already possible to use the llvm-gcc4 frontend to compile and run Mozilla; it compiles the Mozilla codebase to an intermediate LLVM form and from there into standard binary objects. We would not invent an entirely new language: rather, we would take the GCC frontend and gradually integrate features as needed by Mozilla.
The most pressing problem from my perspective is that using the G++ frontend requires compiling Mozilla with a mingw-llvm toolchain on Windows. The gcc toolchain on Windows does not use MS-COM-compatible vtables, which means that Mozilla code which uses or implements MS-COM interfaces will fail. In addition, we would be tied to the mingw win32api libraries, which are not the supported Microsoft SDKs and may not always be up to date, because they are clean-room reverse-engineered headers and libraries. This mainly affects the accessibility code, which makes extensive use of MS-COM and ATL.
Is this a silly exercise? Would we be spending way too much time on the language and GCC and LLVM and whatnot and not enough on our codebase? Are there other ways to modernize our codebase gradually but effectively? Is LLVM or a custom language something we should consider for mozilla2 or keep in mind for later releases?
Some other problems you didn’t mention: teaching developers our own pet language, adding the burden of downloading/learning new tools to the contributor bar, and dealing with the fact that gcc produces much worse code than MSVC.
I admit it’s awfully tempting to go down the new-language path, but we cannot go down it alone. I would rather have our requirements inform the design of some new language that stands a chance of broad support and adoption.
not really sure on the definition of this one, so I won’t comment; all strings are UTF; GC built-in, but compacting might be tricky with the current version; I’ve seen D’s exceptions integrated seamlessly with Python exceptions and vice-versa; a few of Robert’s suggestions are there; no static analysis from within the compiler, unless you’re doing static checks on template arguments or want to wait for AST macros; no idea
On the C++ advantages side, D has been shown in benchmarks to perform close to C and C++ (you can write code that’s nearly identical between all three if you’re careful). Sadly, D does not have large portions of Mozilla written in it; a bug that is not very high on Walter’s TODO list, I’m afraid.
Other comments on your post: there is a D compiler that outputs LLVM bytecode in the works, and DMD has native support for Win32 COM. The current experimental 2.x branch also has support for linking to some C++ constructs (global functions and virtual member functions of classes with single inheritance) as well as anything with C linkage.
If nothing else, we would *really* appreciate your input as someone who is seeing the limits of the C++ language, and your thoughts on what is important in a successor.
There’s a sub-project at LLVM called clang, led by Apple, to develop a standalone C/C++/ObjC front-end to LLVM, bypassing any need for gcc. It is still in early stages (incomplete C++), but development is in full swing and can certainly benefit from additional support. See
IMO, Mozilla definitely needs a language upgrade for the future. The XPCOM C++ macro hackery really drains you. A language with syntax like Java/C#/D that understands XPCOM would go a long way in alleviating developer stress :p
If the usual rules apply, only 20% of the code needs to be super efficient. The rest needs to be easily maintained (no more pointer foo) and more easily accessible to outside developers. ES4 fits the bill. Any tools, libraries, etc. developed would also have a direct benefit for web development. One million loc * 20% = a hell of a lot smaller nightmare. I just bet that experience with the Tamarin engine will boost efficiency, possibly even to the point where C++ could be replaced completely. If the bytecode needs additions, then add them.
Switching to a language like D would be a poor move in the long run. If you really wanted to be semi-future-proof, I would opt for Scala. 80-core processors are already in the Intel fabs. The C++ threading model is totally unsuited to that sort of machine. It is a good bet that ES4 suffers from the same GIL problems as Python. Any browser that could properly utilize such a machine would smoke its competitors.
I think ROC hit it on the head… having a custom pet language decreases maintainability and access to good tools rather than increasing it. I think that is the opposite of where you want to go.
Could you just ultra-modernize the style of C++ used, moving further away from C, taking more advantage of templates, using more references instead of pointers, and taking advantage of the Boost library? I like the idea of leveraging a well-regarded and well-understood library like Boost, because that way you are working with a wide spectrum of others to fill in the missing language features with a library that is broadly understood by many developers. I think it would be nice to remove as many custom type aliases as you can, leverage stuff like C99 and templates more, and generally work to make the code look as much like other generic C/C++ code as you can, thus increasing the approachability.
I do like the idea of leveraging the good work in LLVM though! How hard would it be to adapt to writing modern C++ that could be compiled similarly to Managed C++, targeting LLVM or even something like the Java runtime, Parrot, Mono, the CLR, etc.? Maybe try to write C++ that is closer to Java/C#/Python/whatever than it is to C?
Once upon a time, when Netscape first open-sourced the code, I looked into helping a project port it to Java - that project quickly died once we saw the state of the code and digested what exactly would be involved with such an undertaking. Though it sounds like, with this automated rewriting effort, it may not have to be that way?
Right now is the perfect time to raise the discussion. (Actually six months ago would have been better but…) There’s no reason a separate NEW_LANGUAGE_BRANCH of the Mozilla 2 repository couldn’t be created from the base and experimented with for a while.
Peter: you cannot identify 20% of the code as the performance-critical stuff and convert the rest to JS. For one thing, which code is important for performance varies by Web page. For another thing, separating performance-critical from not-performance-critical code into separate modules is impossible. For a third thing, the cost of crossing between languages would kill performance and bloat the code.
Whatever direction and whichever language you choose (and I think moving away from C++ is a step in the right direction), it should compile to some kind of intermediate bytecode language, for example LLVM. That enables garbage collection and language independence, and can in the long run make replacing individual modules with new ones written in a completely different language not only possible but probably even easy.
As to separating the codebase into 80% non-efficient-easy-to-maintain and 20% highly-efficient: I don’t know the Mozilla codebase well enough to assert how feasible or possible this is. But I think there must be some pretty obvious modules that could easily be written in another language without hurting performance and without affecting the rendering time of a web page at all: everything in “about:config”, all chrome menus, options, bookmarking code, etc. I’m not saying that you could ever reach anywhere near 80% non-C++ code, but moving to an intermediate byte-compiling framework enables this transition; a transition I think should be made softly, gently, and on a module-to-module basis.
Mozilla is already split into two (primary) languages. Let us not introduce Yet Another Language. The JavaScript part will be addressed by Tamarin, and the strong advantage is the easy cross-over between Mozilla front-end extensions and web applications (as shown by Prism). The C++ part can be gradually improved (the code-rewriting effort to introduce garbage collection, exceptions, etc.) but also minimized (moving more code to JavaScript), focusing the C++ part on the really performance-critical aspects (parsing, scanners, image handling, network/cache, etc.).
Inventing, maintaining, and promoting a new language sounds like a tremendous time sink. I agree with Peter: if we’re creating a new language (ES4) anyway, why not use it? Migrating code incrementally to ES4 would enable measurement of the performance impact of each change in various test scenarios. It would avoid the risk inherent in a “boil the ocean” approach of the type you suggest. And it would allow us to feast on Brendan’s tasty dogfood, which will have a huge impact in jumpstarting the ES4 ecosystem. Some portion of the codebase will remain in C++ for the foreseeable future, but in the really long term, continued improvement in CPU performance and JIT compilation techniques makes it plausible that only a small body of rarely touched code (if any) could not be migrated.
Writing a new compiler has too many drawbacks to be feasible. I think many of the problems in the Mozilla codebase stem from poor/obsolete design decisions and code style. While some of these can be fixed in the source, some fixes can reduce code readability, so it would be interesting to see whether build-time code rewriting could feasibly solve some of these problems.
Poor memory safety is a problem, but proper abstractions and interface design should alleviate most of the typical security problems, as well as the usual array-index-out-of-bounds problems. There are also compiler flags (at least for VC8) that can check some of these accesses.
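To make the "proper abstractions" point concrete, here is a minimal sketch of an array wrapper that checks every index. The name CheckedArray is hypothetical and is not an existing Mozilla class; it simply delegates to std::vector::at, which throws std::out_of_range on a bad index.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper (not a Mozilla class): every element access is
// range-checked, so a bad index throws instead of touching stray memory.
template <typename T>
class CheckedArray {
public:
    explicit CheckedArray(std::size_t length) : items_(length) {}

    std::size_t Length() const { return items_.size(); }

    // std::vector::at throws std::out_of_range when index >= size().
    T& operator[](std::size_t index) { return items_.at(index); }
    const T& operator[](std::size_t index) const { return items_.at(index); }

private:
    std::vector<T> items_;
};

// Usage: a valid write succeeds, an access one past the end is caught.
void DemoCheckedArray() {
    CheckedArray<int> values(4);
    values[0] = 42;
    assert(values[0] == 42);

    bool caught = false;
    try {
        values[4] = 1;  // one past the end
    } catch (const std::out_of_range&) {
        caught = true;
    }
    assert(caught);
}
```

The check costs a comparison per access, which is the usual trade-off such an abstraction makes against raw pointer indexing.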
Some UTF string support can be added through a better string class, and some compilers have language extensions to help. Strings need an overhaul for Mozilla2 anyways…
C++ is one of the few mainstream languages left that gives nearly absolute freedom over memory allocation and management. Integrating two different memory management systems is going to be painful no matter what language is being used. Designing a new language for the purpose of integrating with MMgc is overkill.
Could cross-language exception handling be tackled with automated code rewriting to explicitly marshal exceptions? I think boost::python does something similar to this.
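As a rough illustration of what "explicitly marshal exceptions" could mean, the sketch below wraps a throwing C++ function in a boundary function that converts every exception into a status code. All names here (Status, ParsePort, ParsePortBoundary) are hypothetical; this is not how boost::python or any Mozilla tool is implemented, only the general shape a rewriting tool might generate.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical status codes standing in for whatever the other runtime understands.
enum class Status { Ok, OutOfMemory, InvalidArgument, Failure };

// Hypothetical implementation function that reports errors by throwing.
int ParsePort(const std::string& text) {
    if (text.empty()) throw std::invalid_argument("empty port");
    int value = std::stoi(text);  // may throw std::invalid_argument or std::out_of_range
    if (value < 0 || value > 65535) throw std::out_of_range("port out of range");
    return value;
}

// The kind of wrapper a rewriting tool might generate: no exception crosses
// this boundary; each one is translated to a status code instead.
Status ParsePortBoundary(const std::string& text, int* result) noexcept {
    try {
        *result = ParsePort(text);
        return Status::Ok;
    } catch (const std::bad_alloc&) {
        return Status::OutOfMemory;
    } catch (const std::invalid_argument&) {
        return Status::InvalidArgument;
    } catch (...) {
        return Status::Failure;  // anything unrecognized becomes a generic failure
    }
}

// Usage: success, a recognized error, and an unrecognized error.
void DemoBoundary() {
    int port = 0;
    assert(ParsePortBoundary("8080", &port) == Status::Ok && port == 8080);
    assert(ParsePortBoundary("", &port) == Status::InvalidArgument);
    assert(ParsePortBoundary("99999", &port) == Status::Failure);
}
```

The code on the other side of the boundary would do the reverse mapping, turning each status back into its own native exception type.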
Many of the features that roc has proposed do require a new language to work. This means a new compiler, but that compiler does not necessarily have to target LLVM or Tamarin. Rather, it could generate C++ code, but code that is provably safe (or at least containing bounds checks and compiler hints). This has the advantage of maintaining compatibility with MS-COM, ATL, and the Platform SDK. As long as code could be replaced on a per-module, per-file, or (even better) per-class basis, this seems like a gentle transition.
Some of the static analysis (divergent behavior for stack/heap allocated objects of the same class) could be handled by the aforementioned build-time rewriting.
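For comparison, the simpler half of that wish, forbidding a class from being allocated in the wrong place, can be approximated in plain C++ without any rewriting. The sketch below uses C++11 features and hypothetical class names (StackOnlyGuard, HeapOnlyNode); it only catches direct misuse, and it does not give one class different behavior on the stack versus the heap, which is the part that would still need tooling.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stack-only class: with operator new deleted,
// `new StackOnlyGuard(...)` fails to compile.
class StackOnlyGuard {
public:
    explicit StackOnlyGuard(int* counter) : counter_(counter) { ++*counter_; }
    ~StackOnlyGuard() { --*counter_; }

    StackOnlyGuard(const StackOnlyGuard&) = delete;
    StackOnlyGuard& operator=(const StackOnlyGuard&) = delete;

    static void* operator new(std::size_t) = delete;
    static void* operator new[](std::size_t) = delete;

private:
    int* counter_;
};

// Hypothetical heap-only class: the private destructor means a local
// variable of this type fails to compile; objects come from Create().
class HeapOnlyNode {
public:
    static HeapOnlyNode* Create(int value) { return new HeapOnlyNode(value); }
    void Release() { delete this; }
    int Value() const { return value_; }

private:
    explicit HeapOnlyNode(int value) : value_(value) {}
    ~HeapOnlyNode() = default;
    int value_;
};

// The compiler itself confirms that outside code cannot destroy a HeapOnlyNode.
static_assert(!std::is_destructible<HeapOnlyNode>::value,
              "HeapOnlyNode must not be usable as a stack object");

// Usage: the guard lives on the stack, the node lives on the heap.
void DemoAllocationRules() {
    int live = 0;
    {
        StackOnlyGuard guard(&live);
        assert(live == 1);
    }
    assert(live == 0);

    HeapOnlyNode* node = HeapOnlyNode::Create(5);
    assert(node->Value() == 5);
    node->Release();
}
```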
Mozilla has already reinvented the wheel on so many things. Can’t you at least *look* at other languages that already exist before you start considering making a new one?
Switching to a different language isn’t crazy. If all the currently existing languages suck too much, thinking about making a new one isn’t crazy. But to just immediately start thinking about making a whole new language just for Mozilla, without first looking at some of the tons of languages that are already working and usable, that is crazy.
Related article:
http://benjamin.smedbergs.us/blog/2007-11-05/what-if/