Safura

A secure RPC relayer that uses a machine learning model to detect fraudulent activity at a transaction's destination address and reports to the user when the address is detected to be a phishing wallet.

The problem Safura solves

Problem

  • Since May 2022, $6,331,154 USD has been stolen by phishing accounts on Ethereum mainnet
  • 5,290 victims were targeted
  • Crypto phishing is growing 40% every year

Status Quo

  • People report phishing accounts to Etherscan or on Twitter, and it usually takes more than a week until an account is officially flagged as a scam.
  • By the time an account is known to be a phishing account, it is already too late: many victims have already lost their tokens.

Solution

  • We provide an anti-scam RPC relayer that hooks into eth_sendRawTransaction and inspects the destination address of the transaction.
  • We trained a binary classifier model on an Ethereum fraud detection dataset.
  • We query Etherscan for the latest transaction history of the destination address and use it as input to the model we trained. If the address is determined to be a scam wallet, the transaction is aborted. This is effective because we always work from the up-to-date transaction history on the blockchain, which lets a fraudulent wallet be detected as early as possible.
  • To verify the machine learning model, which is public, we generate a zero-knowledge proof of the model using the ezkl tool, and the proof is sent to the smart wallet (Account Abstraction) contract to be verified on-chain.
  • If the proof is verified and the destination address is safe to transfer funds to, the transaction is finally executed from the contract wallet.
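
The relayer step described above can be sketched roughly as follows. This is a minimal illustration, not Safura's actual code: the function name handle_rpc and the three injected callables (extract_destination, is_scam, forward) are hypothetical stand-ins for the raw-transaction decoder, the trained classifier, and the upstream Ethereum node. The error code -32000 is just one value from the JSON-RPC implementation-defined server error range.

```python
from typing import Any, Callable, Dict, Optional

JsonRpc = Dict[str, Any]


def handle_rpc(
    request: JsonRpc,
    extract_destination: Callable[[str], Optional[str]],  # hypothetical decoder
    is_scam: Callable[[str], bool],                        # hypothetical classifier
    forward: Callable[[JsonRpc], JsonRpc],                 # hypothetical upstream call
) -> JsonRpc:
    """Screen eth_sendRawTransaction calls; pass every other method through."""
    if request.get("method") != "eth_sendRawTransaction":
        return forward(request)

    raw_tx = request["params"][0]
    destination = extract_destination(raw_tx)

    # Abort instead of relaying when the destination looks like a phishing wallet.
    if destination is not None and is_scam(destination):
        return {
            "jsonrpc": "2.0",
            "id": request.get("id"),
            "error": {
                "code": -32000,
                "message": f"Blocked: {destination} is flagged as a suspected phishing wallet",
            },
        }

    return forward(request)
```

Passing the decoder, classifier, and upstream node in as callables keeps the screening logic testable without a live node or a trained model.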

Challenges we ran into

ZKML

  • The EZKL library had multiple bugs in its Python bindings. We had to fix them ourselves, and we intend to open some PRs after the hackathon.
  • We had trouble quantizing the inputs so that they are usable by the Solidity contracts. The confusion was mainly caused by the documentation.
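
To illustrate what quantizing the inputs means here, the sketch below converts floats to fixed-point integers by multiplying by a power of two and rounding, which is the general idea behind feeding model inputs to a contract that only handles integers. The function names and the scale_bits value of 7 are assumptions chosen for illustration; this is not the ezkl API and may not match the scale used in the project.

```python
from typing import List


def quantize(values: List[float], scale_bits: int = 7) -> List[int]:
    """Map floats to fixed-point integers: round(v * 2**scale_bits)."""
    factor = 1 << scale_bits
    return [round(v * factor) for v in values]


def dequantize(ints: List[int], scale_bits: int = 7) -> List[float]:
    """Recover approximate floats from fixed-point integers."""
    factor = 1 << scale_bits
    return [i / factor for i in ints]


# Usage: inputs already normalized to [0, 1) become small integers.
fixed = quantize([0.0, 0.25, 0.5])
approx = dequantize(fixed)
```

A larger scale_bits keeps more precision but needs more bits per input, which is the trade-off behind the circuit-size concern described under Machine learning.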

Machine learning

  • Because the ZK circuit cannot handle inputs with too many bits, we could not afford a wide range of numbers; otherwise the circuit would become too large to verify on-chain. We used a scaling technique so that the inputs are normalized to values in [0, 1).
  • RandomForestClassifier reached 0.98 accuracy, but the ezkl library did not support such a complicated architecture. So we had to build a vanilla neural network with three simple fully connected layers, which resulted in 0.89 accuracy.
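
The scaling step can be sketched as a per-feature min-max normalization. This is an assumed variant, not necessarily the exact technique used in the project: the helper names are hypothetical, and the clamp to just below 1 is one simple way to keep every value inside the half-open range [0, 1), including values outside the range seen during fitting.

```python
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Collect per-feature minimum and maximum from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale_row(
    row: Sequence[float],
    mins: Sequence[float],
    maxs: Sequence[float],
    eps: float = 1e-9,
) -> List[float]:
    """Scale one feature row into [0, 1) using the fitted bounds."""
    scaled = []
    for x, lo, hi in zip(row, mins, maxs):
        value = (x - lo) / (hi - lo) if hi > lo else 0.0
        # Clamp so unseen or maximal values never reach 1.0 or drop below 0.0.
        scaled.append(min(max(value, 0.0), 1.0 - eps))
    return scaled


# Usage with made-up feature rows (for example, tx count and total value).
train = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
mins, maxs = fit_min_max(train)
midpoint = scale_row([5.0, 20.0], mins, maxs)
```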

Web3

  • The transaction hash spec keeps changing, but web3.py did not have the latest spec in its library, so we had to write the implementation ourselves.
  • In order to deploy our safe smart wallet (a contract wallet with a ZK verifier), we had to create a wallet factory that creates the contract wallet for the user.
  • We used The Graph protocol to query the contract wallet address from the wallet factory contract.
