
High Concept: Bypassing the LLVM back end

August 11, 2015

I have been banging my head against the complexity of the LLVM structure and build process for well over a year. It’s a brutally hard thing to get a grip on, cluttered with C++ classes and templates and make and llvmbuild and .td files and god knows what all. Back ends have certainly been written for targets similar to the 1802 (e.g. the Z80), but it just seems to be beyond me.

The Clang C++ compiler translates C++ into LLVM's intermediate representation (IR), which the appropriate back end then lowers to (say) Z80 or x86 assembly. A lot of the hard work and optimization has already been done by that point. It occurs to me that it might be possible to write a translator from LLVM IR directly to 1802 assembly, ignoring the LLVM back-end machinery entirely. For example, the Z80 translation sequence is shown below:

GIVEN THE C++ SOURCE CODE
~/Documents/llvm/llvm-archive/llvm-z80/build> cat exwjr1.cpp
short int foo(short int a, short int b) {
    short int result = a + b;   // r0 + r1
    return result;        // r0
}

short int foo(int a) {
    int result = a;   // r0
    return result;        // r0
}

int main(){
  return foo(1,2);
}

THE CLANG COMPILER GENERATES THE IR SHOWN BELOW
>./bin/clang -cc1 -triple z80-unknown-unknown -S -O3 -o exwjr1.ll -emit-llvm exwjr1.cpp

; ModuleID = 'exwjr1.cpp'
target datalayout = "e-p:16:8:8-i8:8:8-i16:8:8-n8:16"
target triple = "z80-unknown-unknown"

; Function Attrs: nounwind readnone
define i16 @_Z3fooss(i16 %a, i16 %b) #0 {
  %1 = add nsw i16 %b, %a
  ret i16 %1
}

; Function Attrs: nounwind readnone
define i16 @_Z3fooi(i16 %a) #0 {
  ret i16 %a
}

THE Z80 BACK END TRANSLATES IT TO THE FOLLOWING Z80 ASSEMBLY
>./bin/llc -march z80 -O3 -filetype=asm exwjr1.ll

>cat exwjr1.s
	.file	"exwjr1.ll"
	.text
	.globl	_Z3fooss
	.type	_Z3fooss,@function
_Z3fooss:                               # @_Z3fooss
# BB#0:
	push	ix
	push	bc
	ld	ix, 0
	add	ix, sp
	ld	sp, ix
	ld	b, h
	ld	c, l
	ld	l, (ix+6)
	ld	h, (ix+7)
	add	hl, bc
	pop	bc
	pop	ix
	ret
.tmp0:
	.size	_Z3fooss, .tmp0-_Z3fooss

	.globl	_Z3fooi
	.type	_Z3fooi,@function
_Z3fooi:                                # @_Z3fooi
# BB#0:
	ret
.tmp1:
	.size	_Z3fooi, .tmp1-_Z3fooi

	.globl	main
	.type	main,@function
main:                                   # @main
# BB#0:
	ld	hl, 3
	ret
.tmp2:
	.size	main, .tmp2-main

So the theory is that the final step could be done by converting the LLVM IR into 1802 macros, plus some code to analyze each function and insert prologue and return sequences.
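To make the idea a bit more concrete, here is a minimal sketch of such a translator in Python. It is a sketch under stated assumptions, not a working 1802 code generator: it only understands the three kinds of IR lines in the listing above (`define`, a 16-bit `add`, and `ret`), the macro names (PROLOGUE, ADD16, RETVAL16, EPILOGUE) are hypothetical and not taken from any existing 1802 macro library, and the convention of handing out registers R8 through RF in order of first use is invented for illustration. The only hardware fact it relies on is that the 1802 has sixteen 16-bit registers.

```python
import re

# Assumptions: all macro names (PROLOGUE, ADD16, RETVAL16, EPILOGUE) are
# hypothetical, and so is the rule of allocating R8..RF in order of first use.
FIRST_REG, LAST_REG = 0x8, 0xF

DEFINE_RE = re.compile(r"define\s+\S+\s+@([\w.$]+)\((.*)\)")
ADD_RE = re.compile(
    r"(%[\w.]+)\s*=\s*add\s+(?:(?:nsw|nuw)\s+)*i16\s+(%[\w.]+),\s*(%[\w.]+)$"
)
RET_RE = re.compile(r"ret\s+i16\s+(%[\w.]+)$")


def translate(ir_text):
    """Translate a tiny subset of LLVM IR text into a list of macro lines."""
    out = []
    regs = {}  # IR value name -> register name, reset for each function

    def reg(value):
        # Give each new IR value the next free register.
        if value not in regs:
            number = FIRST_REG + len(regs)
            if number > LAST_REG:
                raise ValueError("out of registers; a real tool would spill")
            regs[value] = f"R{number:X}"
        return regs[value]

    for raw in ir_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop IR comments
        if not line or line == "}" or line.startswith("target "):
            continue
        m = DEFINE_RE.match(line)
        if m:
            regs.clear()
            out.append(f"{m.group(1)}:")
            out.append("    PROLOGUE")
            for param in filter(None, (p.strip() for p in m.group(2).split(","))):
                reg(param.split()[-1])  # arguments take the first registers
            continue
        m = ADD_RE.match(line)
        if m:
            dest, lhs, rhs = m.groups()
            lhs_reg, rhs_reg = reg(lhs), reg(rhs)
            out.append(f"    ADD16 {reg(dest)}, {lhs_reg}, {rhs_reg}")
            continue
        m = RET_RE.match(line)
        if m:
            out.append(f"    RETVAL16 {reg(m.group(1))}")
            out.append("    EPILOGUE")
            continue
        raise ValueError(f"unsupported IR line: {line!r}")
    return out


SAMPLE_IR = """\
; Function Attrs: nounwind readnone
define i16 @_Z3fooss(i16 %a, i16 %b) #0 {
  %1 = add nsw i16 %b, %a
  ret i16 %1
}
"""

if __name__ == "__main__":
    print("\n".join(translate(SAMPLE_IR)))
```

Anything the sketch does not recognize raises an error rather than being silently skipped, which seems like the safer default for a tool that would grow one IR instruction at a time. Real IR would also need constants, loads and stores, calls, branches and some answer to running out of registers, none of which is attempted here.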

Also, there’s this http://hackaday.com/2015/08/06/hacking-a-universal-assembler/ which could probably make a bad idea worse.



2 Comments
  1. Marc Jacobi

    Bit late perhaps, but does this help?
    http://llvm.org/docs/tutorial/index.html

    • Thanks, I appreciate the thought. CPU0 is an example of one of the many, many blind alleys I went down. There are no dates on the material (of course), but the “setting up your mac” section is based on Mountain Lion. Unless these tutorials are kept right up to date with the tools, they’re no good for someone like me.

      I have to admit though that the PDF doc shows updates to November of this year so it could be worth another go. http://jonathan2251.github.io/lbd/TutorialLLVMBackendCpu0.pdf
